apollo.datamodel
Class FeatureSet

java.lang.Object
  extended by apollo.datamodel.Range
      extended by apollo.datamodel.SeqFeature
          extended by apollo.datamodel.FeatureSet
All Implemented Interfaces:
Comparable, FeatureSetI, RangeI, SeqFeatureI, TranslationI, java.io.Serializable, java.lang.Cloneable
Direct Known Subclasses:
AnnotatedFeature, StrandedFeatureSet

public class FeatureSet
extends SeqFeature
implements FeatureSetI

I think SeqFeature should implement all the functions of FeatureSet so that it is polymorphic: when you have a SeqFeatureI, you should not have to care whether it is a SeqFeature or a FeatureSet, and you should not have to keep downcasting to descend the datamodel (or to use any other feature set functionality). As it stands, you go through the Vector of SeqFeatures, cast each element to SeqFeature, test it with instanceof FeatureSetI, and then cast it to FeatureSetI, which is potentially confusing.
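
The traversal pattern described above can be sketched as follows. This is only an illustration: SketchFeature and SketchFeatureGroup are hypothetical stand-ins for SeqFeatureI and FeatureSetI, reduced to the two methods the example needs, and none of this is taken from the Apollo source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for SeqFeatureI and FeatureSetI; not the Apollo types.
interface SketchFeature {
    String name();
}

interface SketchFeatureGroup extends SketchFeature {
    List<SketchFeature> kids();
}

final class DowncastSketch {
    static SketchFeature leaf(final String name) {
        return new SketchFeature() {
            public String name() { return name; }
        };
    }

    static SketchFeatureGroup group(final String name, final SketchFeature... kids) {
        return new SketchFeatureGroup() {
            public String name() { return name; }
            public List<SketchFeature> kids() { return Arrays.asList(kids); }
        };
    }

    // Collects leaf names; note the instanceof test and cast needed at every level.
    static List<String> leafNames(SketchFeature f) {
        List<String> out = new ArrayList<>();
        if (f instanceof SketchFeatureGroup) {
            SketchFeatureGroup g = (SketchFeatureGroup) f;
            for (SketchFeature kid : g.kids()) {
                out.addAll(leafNames(kid));
            }
        } else {
            out.add(f.name());
        }
        return out;
    }
}
```

If the base interface exposed the child list itself (empty for leaves), the instanceof test and the cast would disappear, which is the change the comment argues for.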

See Also:
Serialized Form

Field Summary
protected  java.util.Vector features
           
protected  byte flags
          This flag is just for the use of bop to indicate that the feature has already had an exon removed.
 java.lang.String genericReadThroughStopResidue
           
protected  SequenceI hitSequence
           
protected static org.apache.log4j.Logger logger
           
protected  int minus1_frameshift
           
protected  boolean missing_3prime
           
protected  boolean missing_5prime
           
protected  int plus1_frameshift
           
static byte POLYA_REMOVED
           
protected  java.lang.String readthrough_stop
           
protected static java.lang.String standard_start_codon
          sometimes the translation may have an unconventional start codon and we need to note this
protected  java.lang.String start_codon
           
protected  boolean trans_spliced
           
 
Fields inherited from class apollo.datamodel.SeqFeature
biotype, edit_offset_adjust, id, phase, properties, ref_features, refFeature, refId, score, scores
 
Fields inherited from class apollo.datamodel.Range
high, low, name, refSeq, strand, type
 
Fields inherited from interface apollo.datamodel.RangeI
NO_NAME, NO_TYPE
 
Constructor Summary
FeatureSet()
           
FeatureSet(FeatureList kids, java.lang.String name, java.lang.String type, int strand)
           
FeatureSet(FeatureSetI fs, java.lang.String class_name)
           
FeatureSet(int low, int high, java.lang.String type, int strand)
           
FeatureSet(SeqFeatureI sf)
           
FeatureSet(java.lang.String type, int strand)
           
 
Method Summary
 void accept(Visitor visitor)
          General implementation of Visitor pattern.
 void addFeature(SeqFeatureI feature)
          Add feature to end of features list, recalc low and high
 void addFeature(SeqFeatureI feature, boolean sort)
          Description copied from SeqFeature, where this method is a no-op; FeatureSet overrides it.
 void adjustEdges()
          Set low and high according to lowest and highest coord in kids
 void adjustEdges(SeqFeatureI span)
          If span has higher high and or lower low than current, reset high/low
 boolean beforeFivePrimeEnd(SeqFeatureI feature)
          Returns true if the feature passed in has a 3 prime end that is more 5prime than this feature.
 SequenceEdit[] buildEditList()
           
protected  SequenceEdit[] buildORFEditList()
           
 void calcTranslationStartForLongestPeptide()
          This sets the start at a standard start codon that gives the longest peptide (which may not be the first start codon).
 boolean canHaveChildren()
          This method determines if there are any child SeqFeatures in this set (FeatureSets are NOT included).
 void clearKids()
          by default no kids - no-op
 java.lang.Object clone()
          to get a field-by-field replica of this feature
 void deleteFeature(SeqFeatureI feature)
          The number of directly contained features.
 SeqFeatureI deleteFeatureAt(int i)
           
 FeatureList findFeaturesByAllNames(java.lang.String name)
          Searches recursively on both name and hit name
 FeatureList findFeaturesByAllNames(java.lang.String searchString, boolean useRegExp)
           
 FeatureList findFeaturesByAllNames(java.lang.String searchString, boolean useRegExp, boolean kidNamesOverParent)
          useRegExp is whether to search using pattern as a regular expression. In fact, we ALWAYS do a RegExp search with the ORO pattern matchers.
 FeatureList findFeaturesByHitName(java.lang.String hname)
          Returns FeatureList of Features with hit name.
 FeatureList findFeaturesByName(java.lang.String name)
          Returns a FeatureList of all SeqFeatureIs that have this name; the list is empty if no features match.
 FeatureList findFeaturesByName(java.lang.String name, boolean kidNamesOverParent)
           
 void flipFlop()
          Overrides SeqFeature.flipFlop.
 java.lang.String get_cDNA()
          This needs to be fixed to account for edits to the genomic sequence, but I don't think it is urgent because this case is still so extremely rare.
 java.lang.String get_ORF(java.lang.String mRNA)
          will return an empty string if the translation start site has not been set.
protected  java.lang.String get_ORF(java.lang.String mRNA, int start_offset, int end_offset)
           
 SeqFeatureI getFeatureAt(int i)
          returns a seqfeature at the specified position
 SeqFeatureI getFeatureContaining(int position)
          Returns the FIRST feature in the set containing the position; returns this if none of the children contain the position but the FeatureSet does (intron); returns null if the FeatureSet doesn't contain the position.
 int getFeatureIndex(SeqFeatureI sf)
          By default SeqFeature has no kids so returns -1 by default.
 int getFeaturePosition(int genomic_pos)
          This is an important method.
 java.util.Vector getFeatures()
          returns a vector of all the child features belonging to this feature.
 int getGenomicPosForPeptidePos(int peptidePosition)
          For a position in peptide coordinates, get the corresponding genomic position. For now just do this in FeatureSet; I could imagine having a peptide object.
 int getGenomicPosition(int transcript_pos)
          Converts a transcript position (1 based without introns of course) to a genomic position
 SequenceI getHitSequence()
          If kids have hit feats then this is the hit seq associated with them. This is a convenience to avoid having to get the hit seq from kids.
 int getIndexContaining(int position)
           
 int getLastBaseOfStopCodon()
           
 FeatureList getLeafFeatsOver(int pos)
          This is used in the base editor to find the sub features that overlap a base with a sequence edit on it
 java.lang.String getName()
          In the case where the range is chromosomal the name is the chromosome name
 int getNumberOfDescendents()
          The number of descendants (direct and indirect) in this FeatureSetI.
 int getPeptidePosForGenomicPos(int genomicPosition)
           
 int getPositionFrom(int base_position, int base_offset)
           
 double getScore()
           
 SequenceEdit getSequencingErrorAtPosition(int base_position)
          any errors in the genomic sequence will apply to all of the transcripts for the gene
 int getSplicedLength()
           
 int getSplicedLength(int startExon, int endExon)
          This needs to be fixed to account for edits to the genomic sequence
 java.lang.String getSplicedTranscript(int startExon, int endExon)
           
 java.lang.String getStartAA()
           
 java.lang.String getStartCodon()
           
 TranslationI getTranslation()
          FeatureSetI itself implements TranslationI so just return self - in future this may be done with a separate translation object
 int getTranslationEnd()
           
 RangeI getTranslationRange()
          TranslationI interface
 int getTranslationStart()
          Returns start of translation in genomic coords
 boolean hasDescendents()
          returns true if the count of the number of leaf features (those that can't have child features themselves) is > 0.
 boolean hasNameBeenSet()
           
 boolean hasReadThroughStop()
           
 boolean hasTranslation()
          FeatureSets have translations.
 boolean hasTranslationEnd()
          Returns true if there is end of translation for the transcript, ie getTranslationEnd()!=0.
 boolean hasTranslationStart()
          Returns true if transcript has a translation start (!=0)
protected  void insertFeatureAt(SeqFeatureI feature, int position)
          Add feature(kid) to features list at position position.
 boolean isFlagSet(int mask)
          return the current state of the bit for this flag
 boolean isMissing3prime()
           
 boolean isMissing5prime()
          If true this means there is no real start codon - it's missing. Rename this isMissing5PrimeStart? Or isMissingTranslationStart? hasTranslationStart() can be true while isMissing5prime is true - this means that there's a "contrived" start at the beginning of the transcript.
 boolean isProteinCodingGene()
          Returns the value of the isProteinCodingGene flag, which defaults to false (Cyril P 01.15.06). History: this method used to always return true, which raised the question of whether all FeatureSets really are protein coding genes, or whether the method was simply never called because it is overridden in more specific classes. It was first changed to return false (MG 11.21.05) and then to return the flag.
 boolean isSequencingErrorPosition(int base_position)
          any errors in the genomic sequence will apply to all of the transcripts for the gene
 boolean isTransSpliced()
           
 int minus1FrameShiftPosition()
           
 boolean pastThreePrimeEnd(SeqFeatureI feature)
          Returns true if the feature passed in has a 5 prime start that is located beyond the 3prime end of this feature.
 int plus1FrameShiftPosition()
           
 boolean rangeIsUnassigned()
          Return true if range has been assigned high & low
 int readThroughStopPosition()
           
 java.lang.String readThroughStopResidue()
           
 void setFlag(boolean state, byte mask)
           
 void setHitSequence(SequenceI seq)
          Is this an explicit alignment used by jalview - the alternative to cigars? No, I don't think it is - it's used by the analysis adapters & game. If kids have hit feats then this is the hit seq associated with them. This is a convenience to avoid having to get the hit seq from kids.
 boolean setMinus1FrameShiftPosition(int shift_pos)
           
 void setMissing3prime(boolean partial)
           
 void setMissing5prime(boolean partial)
           
 void setPeptideValidity(boolean validity)
          TranslationI method - no-op overridden by Transcript.
protected  void setPhases()
           
 boolean setPlus1FrameShiftPosition(int shift_pos)
          returns false if the frame shift position is unworkable
 void setProteinCodingGene(boolean isProteinCodingGene)
          A setter is needed for some FeatureSets, like gene prediction results, which can have translation start and stop without being instances of transcripts
 void setReadThroughStop(boolean readthrough)
          The generic version--set readthrough stop to true or false.
 void setReadThroughStop(java.lang.String residue)
           
 void setRefSequence(SequenceI seq)
          This is presently used to locate features that have a drawable.
 void setTranslationEnd(int pos)
           
 void setTranslationEndFromStart()
          Sets the translation end to the end of the ORF from the current translation start OR to the end of the last exon if no stop codons are found in phase.
 void setTranslationEndNoPhase(int pos)
           
 boolean setTranslationStart(int pos)
           
 boolean setTranslationStart(int pos, boolean set_end)
          If pos is not contained in FeatureSet, set trans start fails and false is returned, true on success.
 int size()
          FeatureSet overrides - merge with getNumberOfChildren
 void sort(int sortStrand)
           
 void sort(int sortStrand, boolean byLow)
          sort the child features of the set
 java.lang.String toString()
          For debugging
 java.lang.String translate()
          Conceptually any piece of sequence may potentially be translated. This is useful to ascertain the potential of any given sequence; it may also be needed if a prediction program provides this information. This method translates by extracting the coding pieces of sequence from the exons to create a single string which is then translated with phase 0.
 boolean unConventionalStart()
           
 boolean withinCDS(int pos)
          Returns true if the position is within the CDS of this feature
 
Methods inherited from class apollo.datamodel.SeqFeature
addDbXref, addProperty, addRefFeature, addScore, addScore, addScore, addScore, alignmentIsPeptide, amend_RNA, buildmRNAEditList, clearProperties, cloneFeature, compareTo, descendsFrom, getAlignment, getAnalogousOppositeStrandFeature, getAnnotatedFeature, getCigar, getCloneSource, getCodingDNA, getCodingProperties, getDatabase, getDbXref, getDbXrefs, getEndPhase, getExplicitAlignment, getFeatureSequence, getFrame, getGenomicErrors, getHend, getHhigh, getHitFeature, getHlow, getHname, getHstart, getHstrand, getId, getIdentifier, getNumberOfChildren, getParent, getPeptideSequence, getPhase, getPrimaryDbXref, getProgramName, getProperties, getPropertiesMulti, getProperty, getPropertyMulti, getProteinFeat, getRefFeature, getRefFeature, getRefId, getScore, getScores, getStrandedFeatSetAncestor, getSubSequence, getSyntenyLinkInfo, getTopLevelType, getUnpaddedAlignment, getUserObject, hasAlignable, hasAnalogousOppositeStrandFeature, hasAnnotatedFeature, hasCigar, hasHitFeature, hasId, hasKids, hasPeptideSequence, hasSyntenyLinkInfo, haveExplicitAlignment, haveRealAlignment, initWithSeqFeat, isAncestorOf, isAnnot, isAnnotTop, isClone, isCodon, isExon, isProtein, isSameFeat, isSequencingError, isTranscript, main, merge, na2aa, numberOfGenerations, parseCigar, removeProperty, replaceProperty, setAlignment, setAnalogousOppositeStrandFeature, setCigar, setDatabase, setExplicitAlignment, setId, setIdentifier, setPhase, setProgramName, setQueryFeature, setRefFeature, setScore, setSyntenyLinkInfo, setTopLevelType, setUserObject
 
Methods inherited from class apollo.datamodel.Range
contains, contains, convertFromBaseOrientedToInterbase, convertFromInterbaseToBaseOriented, getEnd, getEndAsString, getFeatureType, getHigh, getLeftOverlap, getLow, getRangeClone, getRefSequence, getResidues, getRightOverlap, getStart, getStartAsString, getStrand, hasFeatureType, hasName, hasRefSequence, isContainedByRefSeq, isExactOverlap, isForwardStrand, isIdentical, isSequenceAvailable, length, overlaps, sameRange, setEnd, setFeatureType, setHigh, setLow, setName, setStart, setStrand
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface apollo.datamodel.SeqFeatureI
addDbXref, addProperty, addScore, addScore, addScore, addScore, alignmentIsPeptide, clearProperties, cloneFeature, compareTo, descendsFrom, getAlignment, getAnalogousOppositeStrandFeature, getAnnotatedFeature, getCigar, getCloneSource, getCodingDNA, getCodingProperties, getDatabase, getDbXref, getDbXrefs, getEndPhase, getExplicitAlignment, getFeatureSequence, getFrame, getGenomicErrors, getHend, getHhigh, getHitFeature, getHlow, getHname, getHstart, getHstrand, getId, getNumberOfChildren, getParent, getPeptideSequence, getPhase, getProgramName, getProperties, getPropertiesMulti, getProperty, getPropertyMulti, getProteinFeat, getRefFeature, getRefId, getScore, getScores, getStrandedFeatSetAncestor, getSyntenyLinkInfo, getTopLevelType, getUnpaddedAlignment, getUserObject, hasAlignable, hasAnalogousOppositeStrandFeature, hasAnnotatedFeature, hasHitFeature, hasId, hasKids, hasPeptideSequence, hasSyntenyLinkInfo, haveExplicitAlignment, haveRealAlignment, isAncestorOf, isAnnot, isAnnotTop, isClone, isCodon, isExon, isProtein, isSameFeat, isSequencingError, isTranscript, merge, numberOfGenerations, parseCigar, removeProperty, replaceProperty, setAlignment, setAnalogousOppositeStrandFeature, setCigar, setDatabase, setExplicitAlignment, setId, setPhase, setProgramName, setQueryFeature, setRefFeature, setScore, setSyntenyLinkInfo, setTopLevelType, setUserObject
 
Methods inherited from interface apollo.datamodel.RangeI
contains, contains, convertFromBaseOrientedToInterbase, convertFromInterbaseToBaseOriented, getEnd, getFeatureType, getHigh, getLeftOverlap, getLow, getRangeClone, getRefSequence, getResidues, getRightOverlap, getStart, getStrand, hasFeatureType, hasName, hasRefSequence, isContainedByRefSeq, isExactOverlap, isForwardStrand, isIdentical, isSequenceAvailable, length, overlaps, sameRange, setEnd, setFeatureType, setHigh, setLow, setName, setStart, setStrand
 

Field Detail

logger

protected static final org.apache.log4j.Logger logger

POLYA_REMOVED

public static byte POLYA_REMOVED

standard_start_codon

protected static java.lang.String standard_start_codon
sometimes the translation may have an unconventional start codon and we need to note this


features

protected java.util.Vector features

hitSequence

protected SequenceI hitSequence

flags

protected byte flags
This flag is just for the use of bop to indicate that the feature has already had an exon removed. Or for other simple flags that we don't want to allocate an entire boolean to.
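
A byte of bit flags like this is typically driven through a set/test pair such as setFlag(boolean, byte) and isFlagSet(int). The sketch below is an illustration only: FlagSketch is a hypothetical class, and the mask value given to POLYA_REMOVED is an assumption, since the real constant's value is not documented here.

```java
// Hypothetical sketch of byte-sized bit flags; not the Apollo source.
final class FlagSketch {
    // Assumed mask value; the real POLYA_REMOVED value is not given in this page.
    static final byte POLYA_REMOVED = 1 << 0;

    private byte flags;

    // Sets (state == true) or clears (state == false) the bits selected by mask.
    void setFlag(boolean state, byte mask) {
        if (state) {
            flags |= mask;
        } else {
            flags &= (byte) ~mask;
        }
    }

    // Returns true if any bit selected by mask is currently set.
    boolean isFlagSet(int mask) {
        return (flags & mask) != 0;
    }
}
```

Packing several booleans into one byte keeps per-feature memory small, which matters when a data set holds a very large number of features.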


plus1_frameshift

protected int plus1_frameshift

minus1_frameshift

protected int minus1_frameshift

readthrough_stop

protected java.lang.String readthrough_stop

genericReadThroughStopResidue

public java.lang.String genericReadThroughStopResidue

trans_spliced

protected boolean trans_spliced

missing_5prime

protected boolean missing_5prime

missing_3prime

protected boolean missing_3prime

start_codon

protected java.lang.String start_codon
Constructor Detail

FeatureSet

public FeatureSet()

FeatureSet

public FeatureSet(SeqFeatureI sf)

FeatureSet

public FeatureSet(java.lang.String type,
                  int strand)

FeatureSet

public FeatureSet(int low,
                  int high,
                  java.lang.String type,
                  int strand)

FeatureSet

public FeatureSet(FeatureList kids,
                  java.lang.String name,
                  java.lang.String type,
                  int strand)

FeatureSet

public FeatureSet(FeatureSetI fs,
                  java.lang.String class_name)
Method Detail

hasTranslation

public boolean hasTranslation()
FeatureSets have translations. The translation happens to be the FeatureSet itself: for now FeatSet implements all the TranslationI methods. In the future this might be farmed out to a separate Translation object.

Specified by:
hasTranslation in interface SeqFeatureI
Overrides:
hasTranslation in class SeqFeature

getTranslation

public TranslationI getTranslation()
FeatureSetI itself implements TranslationI so just return self - in future this may be done with a separate translation object

Specified by:
getTranslation in interface SeqFeatureI
Overrides:
getTranslation in class SeqFeature

getName

public java.lang.String getName()
Description copied from interface: RangeI
In the case where the range is chromosomal the name is the chromosome name

Specified by:
getName in interface RangeI
Overrides:
getName in class Range

hasNameBeenSet

public boolean hasNameBeenSet()
Specified by:
hasNameBeenSet in interface FeatureSetI

size

public int size()
Description copied from class: SeqFeature
FeatureSet overrides - merge with getNumberOfChildren

Specified by:
size in interface SeqFeatureI
Overrides:
size in class SeqFeature

addFeature

public void addFeature(SeqFeatureI feature)
Add feature to end of features list, recalc low and high

Specified by:
addFeature in interface SeqFeatureI
Overrides:
addFeature in class SeqFeature

addFeature

public void addFeature(SeqFeatureI feature,
                       boolean sort)
Description copied from class: SeqFeature
no-op - overridden by FeatureSet

Specified by:
addFeature in interface SeqFeatureI
Overrides:
addFeature in class SeqFeature

insertFeatureAt

protected void insertFeatureAt(SeqFeatureI feature,
                               int position)
Add feature (kid) to features list at position position. Recalculate low and high.


pastThreePrimeEnd

public boolean pastThreePrimeEnd(SeqFeatureI feature)
Returns true if the feature passed in has a 5 prime start that is located beyond the 3prime end of this feature. False if the feature is not 3prime of this feature.


beforeFivePrimeEnd

public boolean beforeFivePrimeEnd(SeqFeatureI feature)
Returns true if the feature passed in has a 3 prime end that is more 5prime than this feature. False if the feature is not 5prime of this feature.
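
The two strand-aware tests, pastThreePrimeEnd and beforeFivePrimeEnd, can be sketched as below. This is a minimal illustration under stated assumptions: StrandSketch is a hypothetical class, start is taken to be the 5 prime end and end the 3 prime end in genomic coordinates (so start > end on the reverse strand), and strand is 1 or -1. The real implementation may differ.

```java
// Hypothetical sketch; start is the 5' end and end is the 3' end (assumption).
final class StrandSketch {
    final int start;   // 5' end, genomic coordinate
    final int end;     // 3' end, genomic coordinate
    final int strand;  // 1 = forward, -1 = reverse

    StrandSketch(int start, int end, int strand) {
        this.start = start;
        this.end = end;
        this.strand = strand;
    }

    // True if other's 5' start lies beyond this span's 3' end.
    boolean pastThreePrimeEnd(StrandSketch other) {
        return strand == -1 ? other.start < end : other.start > end;
    }

    // True if other's 3' end lies 5' of this span's 5' start.
    boolean beforeFivePrimeEnd(StrandSketch other) {
        return strand == -1 ? other.end > start : other.end < start;
    }
}
```

On the reverse strand the comparisons flip because moving 3 prime means moving towards lower genomic coordinates.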


deleteFeature

public void deleteFeature(SeqFeatureI feature)
Description copied from interface: FeatureSetI
The number of directly contained features.

Specified by:
deleteFeature in interface FeatureSetI

deleteFeatureAt

public SeqFeatureI deleteFeatureAt(int i)
Specified by:
deleteFeatureAt in interface FeatureSetI

getFeatureAt

public SeqFeatureI getFeatureAt(int i)
Description copied from class: SeqFeature
returns a seqfeature at the specified position

Specified by:
getFeatureAt in interface SeqFeatureI
Overrides:
getFeatureAt in class SeqFeature

getFeatures

public java.util.Vector getFeatures()
Description copied from class: SeqFeature
returns a vector of all the child features belonging to this feature. an empty vector is returned if the feature is unable to have children

Specified by:
getFeatures in interface SeqFeatureI
Overrides:
getFeatures in class SeqFeature

clearKids

public void clearKids()
Description copied from class: SeqFeature
by default no kids - no-op

Specified by:
clearKids in interface SeqFeatureI
Overrides:
clearKids in class SeqFeature

getLeafFeatsOver

public FeatureList getLeafFeatsOver(int pos)
Description copied from interface: SeqFeatureI
This is used in the base editor to find the sub features that overlap a base with a sequence edit on it

Specified by:
getLeafFeatsOver in interface SeqFeatureI
Overrides:
getLeafFeatsOver in class SeqFeature

adjustEdges

public void adjustEdges()
Set low and high according to lowest and highest coord in kids

Specified by:
adjustEdges in interface FeatureSetI

adjustEdges

public void adjustEdges(SeqFeatureI span)
If span has higher high and or lower low than current, reset high/low

Specified by:
adjustEdges in interface FeatureSetI
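
The two adjustEdges variants can be illustrated with the sketch below. EdgeSketch is a hypothetical class holding only low and high, and the child list is passed in explicitly rather than read from a features field; both are simplifications for the example, not the Apollo implementation.

```java
import java.util.List;

// Hypothetical sketch of range widening; not the Apollo source.
final class EdgeSketch {
    int low;
    int high;

    EdgeSketch(int low, int high) {
        this.low = low;
        this.high = high;
    }

    // Widens this range so that it also covers span; never shrinks it.
    void adjustEdges(EdgeSketch span) {
        if (span.low < low) low = span.low;
        if (span.high > high) high = span.high;
    }

    // Resets low/high to the extremes found among the kids (no-op if there are none).
    void adjustEdges(List<EdgeSketch> kids) {
        if (kids.isEmpty()) return;
        int newLow = Integer.MAX_VALUE;
        int newHigh = Integer.MIN_VALUE;
        for (EdgeSketch kid : kids) {
            newLow = Math.min(newLow, kid.low);
            newHigh = Math.max(newHigh, kid.high);
        }
        low = newLow;
        high = newHigh;
    }
}
```

The single-span form can only grow the range, while recomputing from the kids can also shrink it, for example after a child has been deleted.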

findFeaturesByHitName

public FeatureList findFeaturesByHitName(java.lang.String hname)
Returns FeatureList of Features with hit name. If the parent's alignment seq display id has the name, it doesn't look at the children. This has pretty much been replaced by findFeaturesByAllNames, which checks hit names as well as regular names. So should this method be deleted, or is there still some use for it?

Specified by:
findFeaturesByHitName in interface FeatureSetI

findFeaturesByName

public FeatureList findFeaturesByName(java.lang.String name)
Returns a FeatureList of all SeqFeatureIs that have this name; the list is empty if no features match. This is changed from the previous behaviour, which returned only the first feature with that name, since there could be more than one feature with that name. Sometimes a parent FeatureSet has the same name as its children; in that case it is probably uninteresting to get the children, so I think the search should not dig any deeper once it finds a match, but should continue searching other branches.

Specified by:
findFeaturesByName in interface FeatureSetI
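
One way to read the search rule in the description (stop descending once a node matches, but keep searching sibling branches) is sketched below. NameSearchSketch is a hypothetical tree node, exact string equality is an assumption, and the effect of the kidNamesOverParent variant is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a name search that does not descend below a match.
final class NameSearchSketch {
    final String name;
    final List<NameSearchSketch> kids = new ArrayList<>();

    NameSearchSketch(String name, NameSearchSketch... children) {
        this.name = name;
        for (NameSearchSketch c : children) {
            kids.add(c);
        }
    }

    // Returns every matching node; the list is empty if nothing matches.
    List<NameSearchSketch> findByName(String wanted) {
        List<NameSearchSketch> hits = new ArrayList<>();
        collect(wanted, hits);
        return hits;
    }

    private void collect(String wanted, List<NameSearchSketch> hits) {
        if (name.equals(wanted)) {
            hits.add(this);   // match: do not look at this node's kids
            return;
        }
        for (NameSearchSketch kid : kids) {
            kid.collect(wanted, hits);   // other branches are still searched
        }
    }
}
```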

findFeaturesByName

public FeatureList findFeaturesByName(java.lang.String name,
                                      boolean kidNamesOverParent)
Specified by:
findFeaturesByName in interface FeatureSetI

findFeaturesByAllNames

public FeatureList findFeaturesByAllNames(java.lang.String name)
Searches recursively on both name and hit name

Specified by:
findFeaturesByAllNames in interface FeatureSetI

findFeaturesByAllNames

public FeatureList findFeaturesByAllNames(java.lang.String searchString,
                                          boolean useRegExp)
Specified by:
findFeaturesByAllNames in interface FeatureSetI

findFeaturesByAllNames

public FeatureList findFeaturesByAllNames(java.lang.String searchString,
                                          boolean useRegExp,
                                          boolean kidNamesOverParent)
useRegExp is whether to search using pattern as a regular expression. In fact, we ALWAYS do a RegExp search with the ORO pattern matchers. However, if useRegExp is FALSE, we prepend "^" to the pattern, but we still allow the user to specify a "*" wildcard, which we replace with ".*" for compatibility with RegExp. Note that there is no longer any need to lower-case names and patterns, as the pattern will be compiled with the CASE_INSENSITIVE_MASK.

Specified by:
findFeaturesByAllNames in interface FeatureSetI
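
The pattern preparation described above can be sketched as follows. The sketch swaps in java.util.regex for the ORO matchers named in the description, so it is an approximation rather than the actual code; NamePatternSketch and its methods are hypothetical, and using find() (match anywhere unless anchored) is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical sketch using java.util.regex in place of the ORO matchers.
final class NamePatternSketch {
    // Builds the pattern: plain searches are anchored with "^" and "*" becomes ".*".
    // Other regex metacharacters are left untouched, since the description
    // mentions only "^" and "*".
    static Pattern compile(String searchString, boolean useRegExp) {
        String regex = useRegExp
            ? searchString
            : "^" + searchString.replace("*", ".*");
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    static boolean matches(String name, String searchString, boolean useRegExp) {
        return compile(searchString, useRegExp).matcher(name).find();
    }
}
```

With this scheme a plain search for "cg12*" matches names that start with "CG12" in any letter case, while the same text passed as a regular expression may match anywhere in the name.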

setRefSequence

public void setRefSequence(SequenceI seq)
This is presently used to locate features that have a drawable. If the feature can't be found then the drawable is removed. Called after FeatureChangeEvent.SYNC occurs. And this happens when additional features are added - move to SeqFeatureUtil? rename containsFeature?

Specified by:
setRefSequence in interface RangeI
Overrides:
setRefSequence in class Range
Parameters:
seq - the new parent SequenceI

setHitSequence

public void setHitSequence(SequenceI seq)
Is this an explicit alignment used by jalview - the alternative to cigars? No, I don't think it is - it's used by the analysis adapters & game. If kids have hit feats then this is the hit seq associated with them. This is a convenience to avoid having to get the hit seq from kids.

Specified by:
setHitSequence in interface FeatureSetI

getHitSequence

public SequenceI getHitSequence()
If kids have hit feats then this is the hit seq associated with them. This is a convenience to avoid having to get the hit seq from kids.

Specified by:
getHitSequence in interface FeatureSetI
Specified by:
getHitSequence in interface SeqFeatureI
Overrides:
getHitSequence in class SeqFeature

getScore

public double getScore()
Specified by:
getScore in interface SeqFeatureI
Overrides:
getScore in class SeqFeature

getNumberOfDescendents

public int getNumberOfDescendents()
Description copied from class: SeqFeature
The number of descendants (direct and indirect) in this FeatureSetI. This method should find each child, and invoke numChildFeatures for each child that is a FeatureSetI, and add 1 to the count for all others. FeatureSetI implementors should not count themselves, but only the leaf SeqFeatureI implementations. This should be renamed numDescendants. numChild can lead one to think it's just the kids and not further descendants. In fact there should be 2 methods: numDescendants, numChildren.

Specified by:
getNumberOfDescendents in interface SeqFeatureI
Overrides:
getNumberOfDescendents in class SeqFeature
Returns:
the number of features contained anywhere under this FeatureSetI
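
The counting rule described above (recurse into child sets, add 1 for every other child, never count the set itself) can be sketched like this. CountSketch is a hypothetical node type with an explicit isSet marker standing in for the FeatureSetI check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of leaf counting; not the Apollo source.
final class CountSketch {
    final boolean isSet;   // true for a set, false for a leaf feature
    final List<CountSketch> kids = new ArrayList<>();

    private CountSketch(boolean isSet) {
        this.isSet = isSet;
    }

    static CountSketch leaf() {
        return new CountSketch(false);
    }

    static CountSketch set(CountSketch... children) {
        CountSketch s = new CountSketch(true);
        for (CountSketch c : children) {
            s.kids.add(c);
        }
        return s;
    }

    // Sets contribute the count of their own leaves; every other child counts as 1.
    int numberOfDescendents() {
        int count = 0;
        for (CountSketch kid : kids) {
            count += kid.isSet ? kid.numberOfDescendents() : 1;
        }
        return count;
    }

    // True only if at least one leaf exists somewhere below this node.
    boolean hasDescendents() {
        return numberOfDescendents() > 0;
    }
}
```

Under this rule a set that contains only empty sets has a count of 0, even though it has direct children.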

canHaveChildren

public boolean canHaveChildren()
This method determines if there are any child SeqFeatures in this set (FeatureSets are NOT included).

Specified by:
canHaveChildren in interface RangeI
Overrides:
canHaveChildren in class Range

hasDescendents

public boolean hasDescendents()
Description copied from interface: FeatureSetI
returns true if the count of the number of leaf features (those that can't have child features themselves) is > 0. That is, this feature isn't merely a collection of feature sets which are empty themselves

Specified by:
hasDescendents in interface FeatureSetI

getFeatureContaining

public SeqFeatureI getFeatureContaining(int position)
Returns the FIRST feature in the set containing the position; returns this if none of the children contain the position but the FeatureSet does (intron); returns null if the FeatureSet doesn't contain the position.

Specified by:
getFeatureContaining in interface SeqFeatureI
Overrides:
getFeatureContaining in class SeqFeature
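
The three possible results (a child, the set itself, or null) can be illustrated with the sketch below. ContainSketch is a hypothetical class, and the inclusive low/high bounds and the order of the checks are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the child / self / null lookup; not the Apollo source.
final class ContainSketch {
    final int low;
    final int high;
    final List<ContainSketch> kids = new ArrayList<>();

    ContainSketch(int low, int high, ContainSketch... children) {
        this.low = low;
        this.high = high;
        for (ContainSketch c : children) {
            kids.add(c);
        }
    }

    // Inclusive bounds are assumed here.
    boolean contains(int position) {
        return position >= low && position <= high;
    }

    ContainSketch featureContaining(int position) {
        if (!contains(position)) {
            return null;               // outside the set altogether
        }
        for (ContainSketch kid : kids) {
            if (kid.contains(position)) {
                return kid;            // first child that covers the position
            }
        }
        return this;                   // inside the set but between children (intron)
    }
}
```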

getIndexContaining

public int getIndexContaining(int position)
Specified by:
getIndexContaining in interface FeatureSetI

getFeatureIndex

public int getFeatureIndex(SeqFeatureI sf)
Description copied from class: SeqFeature
By default SeqFeature has no kids so returns -1 by default.

Specified by:
getFeatureIndex in interface SeqFeatureI
Overrides:
getFeatureIndex in class SeqFeature

sort

public void sort(int sortStrand)
Specified by:
sort in interface FeatureSetI

sort

public void sort(int sortStrand,
                 boolean byLow)
sort the child features of the set

Specified by:
sort in interface FeatureSetI
Parameters:
sortStrand - sort by minus strand or positive/no strand
byLow - sort by genomic position
See Also:
SeqFeatureUtil.sort()

translate

public java.lang.String translate()
Description copied from class: SeqFeature
Conceptually any piece of sequence may potentially be translated. This is useful to ascertain the potential of any given sequence; it may also be needed if a prediction program provides this information. This method translates by extracting the coding pieces of sequence from the exons to create a single string which is then translated with phase 0. The stop codon is NOT included in the translated region when the translation start and stop have come from the fly XML data. Translates get_ORF of the cDNA. get_ORF returns "" if no translation_start is set; in this case the cDNA is translated. Does phase need to be taken into account somehow?

Specified by:
translate in interface SeqFeatureI
Overrides:
translate in class SeqFeature
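
Translating a spliced coding string with phase 0 can be sketched as below. TranslateSketch is hypothetical and uses the standard genetic code; stopping before the first stop codon and returning 'X' for unrecognised bases are assumptions of the sketch, and exon extraction is not shown.

```java
// Hypothetical sketch of phase-0 translation with the standard genetic code.
final class TranslateSketch {
    private static final String BASES = "TCAG";
    // Amino acids for codons in the order TTT, TTC, TTA, TTG, TCT, ... GGG.
    private static final String AMINO_ACIDS =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Maps one codon to its one-letter amino acid ('*' for stop, 'X' if unknown).
    static char aminoAcid(String codon) {
        int index = 0;
        for (int i = 0; i < 3; i++) {
            int b = BASES.indexOf(Character.toUpperCase(codon.charAt(i)));
            if (b < 0) {
                return 'X';
            }
            index = index * 4 + b;
        }
        return AMINO_ACIDS.charAt(index);
    }

    // Reads complete codons from offset 0 and stops before the first stop codon.
    static String translate(String codingSequence) {
        StringBuilder peptide = new StringBuilder();
        for (int i = 0; i + 3 <= codingSequence.length(); i += 3) {
            char aa = aminoAcid(codingSequence.substring(i, i + 3));
            if (aa == '*') {
                break;
            }
            peptide.append(aa);
        }
        return peptide.toString();
    }
}
```

A trailing partial codon is ignored, so a sequence whose length is not a multiple of three still yields a peptide for its complete codons.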

calcTranslationStartForLongestPeptide

public void calcTranslationStartForLongestPeptide()
This sets the start at a standard start codon that gives the longest peptide (which may not be the first start codon). The stop codon is also automatically set (if there is one). If a longer peptide results from starting at the beginning of the transcript (with missing 5 prime) then the start of the transcript is used and missing5Prime is set to true. Also sets the stop - if there is no stop, missing 3 prime is set to true (& trans end is 0).

Specified by:
calcTranslationStartForLongestPeptide in interface TranslationI
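
The core of the search (try every standard start codon and keep the one whose reading frame gives the longest peptide) can be sketched as follows. LongestOrfSketch is hypothetical, works on a spliced transcript string in 0-based transcript coordinates, assumes "ATG" as the standard start codon, and leaves out the missing 5 prime fallback and the conversion back to genomic coordinates.

```java
// Hypothetical sketch: pick the ATG that yields the longest peptide.
final class LongestOrfSketch {
    static boolean isStop(String codon) {
        return codon.equals("TAA") || codon.equals("TAG") || codon.equals("TGA");
    }

    // Number of complete codons read from offset before a stop codon or the end.
    static int peptideLength(String mrna, int offset) {
        int codons = 0;
        for (int i = offset; i + 3 <= mrna.length(); i += 3) {
            if (isStop(mrna.substring(i, i + 3))) {
                break;
            }
            codons++;
        }
        return codons;
    }

    // Returns the 0-based offset of the best ATG, or -1 if there is no ATG.
    // On a tie the first (most 5') start is kept.
    static int longestStart(String mrna) {
        int best = -1;
        int bestLength = -1;
        for (int i = mrna.indexOf("ATG"); i >= 0; i = mrna.indexOf("ATG", i + 1)) {
            int length = peptideLength(mrna, i);
            if (length > bestLength) {
                best = i;
                bestLength = length;
            }
        }
        return best;
    }
}
```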

setProteinCodingGene

public void setProteinCodingGene(boolean isProteinCodingGene)
A setter is needed for some FeatureSets, like gene prediction results, which can have translation start and stop without being instances of transcripts

Specified by:
setProteinCodingGene in interface FeatureSetI

isProteinCodingGene

public boolean isProteinCodingGene()
Returns the value of the isProteinCodingGene flag, which defaults to false (Cyril P 01.15.06). History: this method used to always return true, which raised the question of whether all FeatureSets really are protein coding genes, or whether the method was simply never called because it is overridden in more specific classes. It was first changed to return false (MG 11.21.05) and then to return the flag.

Specified by:
isProteinCodingGene in interface FeatureSetI
Specified by:
isProteinCodingGene in interface SeqFeatureI
Overrides:
isProteinCodingGene in class SeqFeature
See Also:
setProteinCodingGene(boolean)

setTranslationEndFromStart

public void setTranslationEndFromStart()
Sets the translation end to the end of the ORF from the current translation start OR to the end of the last exon if no stop codons are found in phase. If the stop is missing then stop is set to 0 and missing3Prime is set to true.

Specified by:
setTranslationEndFromStart in interface TranslationI
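
Finding the end of the ORF from a given start amounts to scanning codon by codon for the first in-frame stop. The sketch below shows only that scan: StopScanSketch is hypothetical, offsets are 0-based transcript coordinates, and a result of -1 stands for the "no stop found" case in which the description says missing 3 prime is set.

```java
// Hypothetical sketch of the in-frame stop codon scan; not the Apollo source.
final class StopScanSketch {
    // Returns the offset of the first base of the first in-frame stop codon
    // at or after startOffset, or -1 if the frame runs off the end of the sequence.
    static int firstInFrameStop(String mrna, int startOffset) {
        for (int i = startOffset; i + 3 <= mrna.length(); i += 3) {
            String codon = mrna.substring(i, i + 3);
            if (codon.equals("TAA") || codon.equals("TAG") || codon.equals("TGA")) {
                return i;
            }
        }
        return -1;
    }
}
```

Stop codons that occur out of frame are skipped because the scan advances three bases at a time from the start offset.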

setTranslationStart

public boolean setTranslationStart(int pos)
Specified by:
setTranslationStart in interface TranslationI

setTranslationStart

public boolean setTranslationStart(int pos,
                                   boolean set_end)
If pos is not contained in FeatureSet, set trans start fails and false is returned, true on success.

Specified by:
setTranslationStart in interface TranslationI

setTranslationEndNoPhase

public void setTranslationEndNoPhase(int pos)

setTranslationEnd

public void setTranslationEnd(int pos)
Specified by:
setTranslationEnd in interface TranslationI

getTranslationStart

public int getTranslationStart()
Returns start of translation in genomic coords

Specified by:
getTranslationStart in interface TranslationI

getTranslationEnd

public int getTranslationEnd()
Specified by:
getTranslationEnd in interface TranslationI

hasTranslationEnd

public boolean hasTranslationEnd()
Returns true if there is an end of translation for the transcript, i.e. getTranslationEnd() != 0. Sometimes a valid stop codon does not exist for a transcript. This can happen with unusual sequence data, for example if there is a gap in the sequence where the stop is. Also, in an intermediate editing state, the exon with the stop codon might have been deleted before the annotator has added or extended an exon.

Specified by:
hasTranslationEnd in interface TranslationI

hasTranslationStart

public boolean hasTranslationStart()
Returns true if transcript has a translation start (!=0)

Specified by:
hasTranslationStart in interface TranslationI

getPositionFrom

public int getPositionFrom(int base_position,
                           int base_offset)
Specified by:
getPositionFrom in interface FeatureSetI

getLastBaseOfStopCodon

public int getLastBaseOfStopCodon()
Specified by:
getLastBaseOfStopCodon in interface TranslationI

get_cDNA

public java.lang.String get_cDNA()
Description copied from class: SeqFeature
This needs to be fixed to account for edits to the genomic sequence, but I don't think it is urgent because this case is still so extremely rare. If there are edits then the simple getResidues call is insufficient and needs post-processing

Specified by:
get_cDNA in interface SeqFeatureI
Overrides:
get_cDNA in class SeqFeature

getSplicedTranscript

public java.lang.String getSplicedTranscript(int startExon,
                                             int endExon)

getSplicedLength

public int getSplicedLength()
Specified by:
getSplicedLength in interface FeatureSetI

getSplicedLength

public int getSplicedLength(int startExon,
                            int endExon)
This needs to be fixed to account for edits to the genomic sequence


get_ORF

public java.lang.String get_ORF(java.lang.String mRNA)
Returns an empty string if the translation start site has not been set. This is in contrast to SeqFeature, which returns whatever it can.

Overrides:
get_ORF in class SeqFeature

get_ORF

protected java.lang.String get_ORF(java.lang.String mRNA,
                                   int start_offset,
                                   int end_offset)

getFeaturePosition

public int getFeaturePosition(int genomic_pos)
This is an important method (SUZ). It transforms a position in genomic coordinates into the equivalent position in feature coordinates. In simpler, more practical terms: if there is a translation start site located at position N on the genomic sequence, this returns where the translation start site would be on the mRNA (edited). So it starts from the beginning of the transcript (base 1 on the transcript and some bigger number on the genomic sequence) and returns an offset relative to that 1, not to the genomic sequence.

Specified by:
getFeaturePosition in interface SeqFeatureI
Overrides:
getFeaturePosition in class SeqFeature
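
A minimal sketch of this genomic-to-transcript mapping, under stated assumptions: FeaturePositionSketch is a hypothetical class, exons are passed in transcript order as {fivePrimeBase, threePrimeBase} pairs in 1-based genomic coordinates, strand is +1 or -1, and -1 is this sketch's own convention for a position that falls in an intron or outside the feature. Genomic sequence edits are ignored.

```java
// Hypothetical sketch, not part of apollo.datamodel.
class FeaturePositionSketch {
    /**
     * Returns the 1-based transcript position of genomicPos,
     * or -1 if the position is not inside any exon.
     */
    static int featurePosition(int[][] exons, int strand, int genomicPos) {
        int offset = 0; // spliced bases in the exons already passed
        for (int[] exon : exons) {
            int fivePrime = exon[0];
            int threePrime = exon[1];
            int length = Math.abs(threePrime - fivePrime) + 1;
            // Distance from the exon's first transcribed base, in transcript direction.
            int distance = (genomicPos - fivePrime) * strand;
            if (distance >= 0 && distance < length) {
                return offset + distance + 1;
            }
            offset += length;
        }
        return -1;
    }
}
```

With exons {100,109} and {200,209} on the plus strand, genomic position 200 is the first base of the second exon, so it maps to transcript position 11.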

getGenomicPosition

public int getGenomicPosition(int transcript_pos)
Converts a transcript position (1-based, without introns) to a genomic position.

Specified by:
getGenomicPosition in interface SeqFeatureI
Overrides:
getGenomicPosition in class SeqFeature
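
The conversion from transcript to genomic coordinates can be sketched by walking the exons and consuming their lengths. This is an illustration only: GenomicPositionSketch is a hypothetical class, exons are assumed to be {fivePrimeBase, threePrimeBase} pairs in transcript order, strand is +1 or -1, and returning -1 for an out-of-range position is this sketch's convention, not necessarily Apollo's.

```java
// Hypothetical sketch, not part of apollo.datamodel.
class GenomicPositionSketch {
    /**
     * Returns the genomic position of a 1-based transcript position,
     * or -1 if the position lies outside the spliced transcript.
     */
    static int genomicPosition(int[][] exons, int strand, int transcriptPos) {
        int remaining = transcriptPos - 1; // 0-based offset still to consume
        if (remaining < 0) {
            return -1;
        }
        for (int[] exon : exons) {
            int length = Math.abs(exon[1] - exon[0]) + 1;
            if (remaining < length) {
                // Step from the exon's 5 prime base in the direction of the strand.
                return exon[0] + strand * remaining;
            }
            remaining -= length;
        }
        return -1;
    }
}
```

With the same two exons {100,109} and {200,209} on the plus strand, transcript position 11 skips the intron and lands on genomic position 200.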

getGenomicPosForPeptidePos

public int getGenomicPosForPeptidePos(int peptidePosition)
For a position in peptide coordinates, gets the corresponding genomic position. For now this is done only in FeatureSet; one could imagine having a peptide object.

Specified by:
getGenomicPosForPeptidePos in interface SeqFeatureI
Overrides:
getGenomicPosForPeptidePos in class SeqFeature

getPeptidePosForGenomicPos

public int getPeptidePosForGenomicPos(int genomicPosition)
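
The codon arithmetic behind both peptide conversions can be sketched in transcript coordinates; the genomic step would then go through a transcript/genomic mapping such as getGenomicPosition and getFeaturePosition. The class PeptidePositionSketch is hypothetical, and 1-based peptide and transcript positions are an assumption of this sketch, as is the -1 return for a position upstream of the translation start.

```java
// Hypothetical sketch, not part of apollo.datamodel.
class PeptidePositionSketch {
    /**
     * Transcript position (1-based) of the first base of the codon
     * for the given 1-based peptide position.
     */
    static int transcriptPosForPeptidePos(int translationStart, int peptidePos) {
        return translationStart + (peptidePos - 1) * 3;
    }

    /**
     * Peptide position (1-based) of the codon containing transcriptPos,
     * or -1 if transcriptPos is upstream of the translation start.
     */
    static int peptidePosForTranscriptPos(int translationStart, int transcriptPos) {
        if (transcriptPos < translationStart) {
            return -1;
        }
        return (transcriptPos - translationStart) / 3 + 1;
    }
}
```

With a translation start at transcript position 4, the third residue begins at transcript position 10, and transcript positions 4 through 6 all belong to residue 1.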

setPhases

protected void setPhases()

accept

public void accept(Visitor visitor)
General implementation of Visitor pattern. (see apollo.util.Visitor).

Specified by:
accept in interface SeqFeatureI
Overrides:
accept in class SeqFeature

isTransSpliced

public boolean isTransSpliced()

setMissing5prime

public void setMissing5prime(boolean partial)
Specified by:
setMissing5prime in interface TranslationI

isMissing5prime

public boolean isMissing5prime()
Description copied from interface: TranslationI
If true, there is no real start codon; it is missing. Rename this to isMissing5PrimeStart or isMissingTranslationStart? hasTranslationStart() can be true while isMissing5prime is true, which means that there is a "contrived" start at the beginning of the transcript.

Specified by:
isMissing5prime in interface TranslationI

setMissing3prime

public void setMissing3prime(boolean partial)
Specified by:
setMissing3prime in interface TranslationI

isMissing3prime

public boolean isMissing3prime()
Specified by:
isMissing3prime in interface TranslationI

unConventionalStart

public boolean unConventionalStart()
Specified by:
unConventionalStart in interface FeatureSetI

getStartAA

public java.lang.String getStartAA()
Specified by:
getStartAA in interface FeatureSetI

getStartCodon

public java.lang.String getStartCodon()
Specified by:
getStartCodon in interface FeatureSetI

hasReadThroughStop

public boolean hasReadThroughStop()
Specified by:
hasReadThroughStop in interface FeatureSetI

readThroughStopResidue

public java.lang.String readThroughStopResidue()
Specified by:
readThroughStopResidue in interface FeatureSetI

readThroughStopPosition

public int readThroughStopPosition()
Specified by:
readThroughStopPosition in interface FeatureSetI

setReadThroughStop

public void setReadThroughStop(boolean readthrough)
The generic version: sets readthrough stop to true or false. If true, the readthrough residue is set to the genericReadThroughStopResidue.

Specified by:
setReadThroughStop in interface FeatureSetI

setReadThroughStop

public void setReadThroughStop(java.lang.String residue)
Specified by:
setReadThroughStop in interface FeatureSetI

isSequencingErrorPosition

public boolean isSequencingErrorPosition(int base_position)
Any errors in the genomic sequence will apply to all of the transcripts for the gene.

Specified by:
isSequencingErrorPosition in interface FeatureSetI

getSequencingErrorAtPosition

public SequenceEdit getSequencingErrorAtPosition(int base_position)
Any errors in the genomic sequence will apply to all of the transcripts for the gene.

Specified by:
getSequencingErrorAtPosition in interface FeatureSetI

flipFlop

public void flipFlop()
Overrides SeqFeature.flipFlop. Also flipFlops descendants via recursion. Copied from berkeley_branch (old MAIN trunk).

Specified by:
flipFlop in interface SeqFeatureI
Overrides:
flipFlop in class SeqFeature

clone

public java.lang.Object clone()
Returns a field-by-field replica of this feature.

Specified by:
clone in interface SeqFeatureI
Overrides:
clone in class SeqFeature

getTranslationRange

public RangeI getTranslationRange()
TranslationI interface

Specified by:
getTranslationRange in interface TranslationI

setPeptideValidity

public void setPeptideValidity(boolean validity)
TranslationI method; a no-op here, overridden by Transcript. FeatureSet probably does not need to implement this, since presumably only annotation Transcripts can become invalidated; if it does, the Transcript implementation can always be migrated here.

Specified by:
setPeptideValidity in interface TranslationI

buildEditList

public SequenceEdit[] buildEditList()
Specified by:
buildEditList in interface FeatureSetI

buildORFEditList

protected SequenceEdit[] buildORFEditList()

plus1FrameShiftPosition

public int plus1FrameShiftPosition()
Specified by:
plus1FrameShiftPosition in interface FeatureSetI

minus1FrameShiftPosition

public int minus1FrameShiftPosition()
Specified by:
minus1FrameShiftPosition in interface FeatureSetI

setPlus1FrameShiftPosition

public boolean setPlus1FrameShiftPosition(int shift_pos)
Description copied from interface: FeatureSetI
Returns false if the frame shift position is unworkable.

Specified by:
setPlus1FrameShiftPosition in interface FeatureSetI

setMinus1FrameShiftPosition

public boolean setMinus1FrameShiftPosition(int shift_pos)
Specified by:
setMinus1FrameShiftPosition in interface FeatureSetI

withinCDS

public boolean withinCDS(int pos)
Description copied from interface: FeatureSetI
Returns true if the position is within the CDS of this feature

Specified by:
withinCDS in interface FeatureSetI

isFlagSet

public boolean isFlagSet(int mask)
Returns the current state of the bit for this flag.

Specified by:
isFlagSet in interface FeatureSetI

setFlag

public void setFlag(boolean state,
                    byte mask)
Specified by:
setFlag in interface FeatureSetI
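
The isFlagSet and setFlag pair reads like a conventional bit mask over the byte-sized flags field. A minimal sketch of that pattern follows; FlagSketch is a hypothetical class, and both the numeric value given to POLYA_REMOVED and the second flag OTHER_FLAG are assumptions made for illustration, since the actual values are not documented here.

```java
// Hypothetical sketch, not part of apollo.datamodel.
class FlagSketch {
    static final byte POLYA_REMOVED = 1 << 0; // assumed value
    static final byte OTHER_FLAG = 1 << 1;    // hypothetical second flag

    private byte flags = 0;

    /** Returns the current state of the bit(s) selected by mask. */
    boolean isFlagSet(int mask) {
        return (flags & mask) != 0;
    }

    /** Turns the bit(s) selected by mask on or off, leaving the others untouched. */
    void setFlag(boolean state, byte mask) {
        if (state) {
            flags |= mask;
        } else {
            flags &= ~mask;
        }
    }
}
```

Packing the flags into one byte keeps each feature small, and clearing one flag with the inverted mask does not disturb the others.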

toString

public java.lang.String toString()
For debugging

Overrides:
toString in class SeqFeature

rangeIsUnassigned

public boolean rangeIsUnassigned()
Return true if range has been assigned high & low

Specified by:
rangeIsUnassigned in interface RangeI
Overrides:
rangeIsUnassigned in class Range