Class Labeler

  • Direct Known Subclasses:
    BitSetLabeler, IntLabeler, LongLabeler

    public abstract class Labeler
    extends java.lang.Object
    Analogical Modeling uses labels composed of boolean vectors to group instances into subcontexts and subcontexts into supracontexts. Each training set instance is assigned a label by comparing it with the instance to be classified and encoding the matched and mismatched attributes in a boolean vector. This class assigns those context labels to training instances.
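The comparison described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code used by Labeler or its subclasses: the class LabelSketch and its methods are hypothetical, instances are reduced to string arrays, and a set bit is taken to mean a mismatch, as in the getContextString example on this page.

```java
// Illustrative sketch only; LabelSketch is a hypothetical class, not part of the library.
final class LabelSketch {

    /**
     * Compares two attribute-value arrays position by position.
     * Returns a boolean vector in which true marks a mismatch.
     */
    static boolean[] label(String[] test, String[] data) {
        if (test.length != data.length) {
            throw new IllegalArgumentException("instances must have the same number of attributes");
        }
        boolean[] mismatches = new boolean[test.length];
        for (int i = 0; i < test.length; i++) {
            mismatches[i] = !test[i].equals(data[i]);
        }
        return mismatches;
    }

    /** Renders the vector as a bit string, first attribute first (1 = mismatch). */
    static String toBitString(boolean[] mismatches) {
        StringBuilder sb = new StringBuilder();
        for (boolean m : mismatches) {
            sb.append(m ? '1' : '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] test = {"A", "C", "D", "Z", "R"};
        String[] data = {"A", "C", "X", "Z", "Q"};
        System.out.println(toBitString(label(test, data))); // prints 00101
    }
}
```

Training instances that receive the same vector fall into the same subcontext.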
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      protected static class  Labeler.Partition
      Simple class for storing index spans.
    • Constructor Summary

      Constructors 
      Constructor Description
      Labeler​(weka.core.Instance test, boolean ignoreUnknowns, MissingDataCompare mdc)  
    • Method Summary

      Methods 
      Modifier and Type Method Description
      abstract Label fromBits​(int labelBits)
      For testing purposes, this method allows the client to directly specify the label using the bits of an integer.
      int getCardinality()  
      static int getCardinality​(weka.core.Instance testInstance, boolean ignoreUnknowns)
      Calculates the label cardinality for a given test instance.
      java.util.List<java.lang.String> getContextList​(Label label, java.lang.String mismatchString)
      Returns a list representing the context.
      java.lang.String getContextString​(Label label)
      Returns a string representing the context.
      boolean getIgnoreUnknowns()  
      java.util.List<java.lang.String> getInstanceAttNamesList​(weka.core.Instance instance)  
      java.lang.String getInstanceAttsString​(weka.core.Instance instance, java.lang.String attDelimiter)
      Returns a string containing the attributes of the input instance (minus the class attribute and ignored attributes).
      java.util.List<java.lang.String> getInstanceAttValuesList​(weka.core.Instance instance)
      Returns a list containing the attribute values of the input instance (minus the class attribute and ignored attributes).
      abstract Label getLatticeBottom()
      Creates and returns the label which belongs at the bottom of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a mismatch.
      abstract Label getLatticeTop()
      Creates and returns the label which belongs at the top of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a match.
      MissingDataCompare getMissingDataCompare()  
      weka.core.Instance getTestInstance()  
      boolean isIgnored​(int index)
      Determines whether the attribute at the given index is ignored during labeling.
      abstract Label label​(weka.core.Instance data)
      Create a context label for the input instance by comparing it with the test instance.
      int numPartitions()  
      abstract Label partition​(Label label, int partitionIndex)
      In distributed processing, it is necessary to split labels into partitions.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Labeler

        public Labeler​(weka.core.Instance test,
                       boolean ignoreUnknowns,
                       MissingDataCompare mdc)
        Parameters:
        test - Instance being classified
        ignoreUnknowns - true if attributes with undefined values in the test item should be ignored; false if not.
        mdc - Specifies how to compare missing attributes
    • Method Detail

      • getCardinality

        public int getCardinality()
        Returns:
        The cardinality of the generated labels, or how many instance attributes are considered during labeling.
      • getCardinality

        public static int getCardinality​(weka.core.Instance testInstance,
                                         boolean ignoreUnknowns)
        Calculates the label cardinality for a given test instance.
        Parameters:
        testInstance - instance to assign labels
        ignoreUnknowns - true if unknown values are ignored; false otherwise
        Returns:
        the cardinality of labels generated from testInstance and ignoreUnknowns
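As a rough sketch of what such a count might look like, the snippet below tallies the attributes that would take part in labeling. It assumes that the class attribute is never counted and that, when ignoreUnknowns is true, attributes with unknown values in the test item are skipped. CardinalitySketch is a hypothetical class that models the test item as a string array, with null standing in for an unknown value.

```java
// Illustrative sketch only; CardinalitySketch is hypothetical and does not use the Weka API.
final class CardinalitySketch {

    /**
     * Counts the attributes that would take part in labeling.
     *
     * @param values         attribute values of the test item; null marks an unknown value
     * @param classIndex     position of the class attribute, which is never counted
     * @param ignoreUnknowns whether attributes with unknown test values are skipped
     */
    static int cardinality(String[] values, int classIndex, boolean ignoreUnknowns) {
        int count = 0;
        for (int i = 0; i < values.length; i++) {
            if (i == classIndex) {
                continue; // the class attribute is not part of the label
            }
            if (ignoreUnknowns && values[i] == null) {
                continue; // unknown in the test item, so ignored
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] test = {"A", null, "D", "yes"}; // last value is the class
        System.out.println(cardinality(test, 3, true));  // prints 2
        System.out.println(cardinality(test, 3, false)); // prints 3
    }
}
```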
      • getIgnoreUnknowns

        public boolean getIgnoreUnknowns()
        Returns:
        true if attributes with undefined values in the test item are ignored during labeling; false if not.
      • getMissingDataCompare

        public MissingDataCompare getMissingDataCompare()
        Returns:
        the MissingDataCompare strategy in use by this labeler
      • getTestInstance

        public weka.core.Instance getTestInstance()
        Returns:
        the test instance being used to label other instances
      • isIgnored

        public boolean isIgnored​(int index)
        Determines whether the attribute at the given index is ignored during labeling. The default behavior is to ignore the attributes with unknown values in the test instance if getIgnoreUnknowns() is true.
        Parameters:
        index - Index of the attribute being queried
        Returns:
        True if the given attribute is ignored during labeling; false otherwise.
      • label

        public abstract Label label​(weka.core.Instance data)
        Create a context label for the input instance by comparing it with the test instance.
        Parameters:
        data - Instance to be labeled
        Returns:
        the label for the context that the instance belongs to. The cardinality of the label will be the same as the test and data items. At any given index i, label.matches(i) will return true if that feature is the same in the test and data instances.
        Throws:
        java.lang.IllegalArgumentException - if the test and data instances are not from the same data set.
      • getContextString

        public java.lang.String getContextString​(Label label)
        Returns a string representing the context. If the input test instance attributes are "A C D Z R", and the label is 00101, then the return string will be "A C * Z *".
      • getContextList

        public java.util.List<java.lang.String> getContextList​(Label label,
                                                               java.lang.String mismatchString)
        Returns a list representing the context. If the input test instance attributes are "A C D Z R", the label is 00101, and the mismatchString is "*", then the return list will be "A", "C", "*", "Z", "*".
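The mapping in the examples for getContextString and getContextList can be sketched as follows. This is a minimal illustration, not the library's implementation: ContextSketch is a hypothetical class, the label is passed as a bit string rather than a Label object, and the first character of the string is assumed to correspond to the first attribute.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; ContextSketch is a hypothetical class, not part of the library.
final class ContextSketch {

    /** Replaces each mismatched attribute ('1' in labelBits) with mismatchString. */
    static List<String> contextList(String[] testValues, String labelBits, String mismatchString) {
        if (testValues.length != labelBits.length()) {
            throw new IllegalArgumentException("label length must equal the number of attributes");
        }
        List<String> context = new ArrayList<>();
        for (int i = 0; i < testValues.length; i++) {
            context.add(labelBits.charAt(i) == '1' ? mismatchString : testValues[i]);
        }
        return context;
    }

    /** Joins the context with spaces, using "*" for mismatches. */
    static String contextString(String[] testValues, String labelBits) {
        return String.join(" ", contextList(testValues, labelBits, "*"));
    }

    public static void main(String[] args) {
        String[] test = {"A", "C", "D", "Z", "R"};
        System.out.println(contextString(test, "00101"));    // prints A C * Z *
        System.out.println(contextList(test, "00101", "*")); // prints [A, C, *, Z, *]
    }
}
```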
      • getInstanceAttsString

        public java.lang.String getInstanceAttsString​(weka.core.Instance instance,
                                                      java.lang.String attDelimiter)
        Returns a string containing the attributes of the input instance (minus the class attribute and ignored attributes).
      • getInstanceAttValuesList

        public java.util.List<java.lang.String> getInstanceAttValuesList​(weka.core.Instance instance)
        Returns a list containing the attribute values of the input instance (minus the class attribute and ignored attributes).
      • getInstanceAttNamesList

        public java.util.List<java.lang.String> getInstanceAttNamesList​(weka.core.Instance instance)
      • getLatticeTop

        public abstract Label getLatticeTop()
        Creates and returns the label which belongs at the top of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a match.
        Returns:
        A label with all matches
      • getLatticeBottom

        public abstract Label getLatticeBottom()
        Creates and returns the label which belongs at the bottom of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a mismatch.
        Returns:
        A label with all mismatches
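To make the two extremes of the lattice concrete, here is a small sketch that stores a label in the low bits of an int. It assumes a set bit means a mismatch, as in the getContextString example, and a cardinality of at most 31. LatticeSketch is a hypothetical class, and the concrete subclasses may represent labels differently.

```java
// Illustrative sketch only; LatticeSketch is a hypothetical class, not part of the library.
final class LatticeSketch {

    /** Top of the lattice: every feature matches, so no mismatch bit is set. */
    static int latticeTop(int cardinality) {
        checkCardinality(cardinality);
        return 0;
    }

    /** Bottom of the lattice: every feature mismatches, so the lowest cardinality bits are set. */
    static int latticeBottom(int cardinality) {
        checkCardinality(cardinality);
        return (1 << cardinality) - 1;
    }

    private static void checkCardinality(int cardinality) {
        if (cardinality < 0 || cardinality > 31) {
            throw new IllegalArgumentException("cardinality must be between 0 and 31");
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(latticeTop(5)));    // prints 0
        System.out.println(Integer.toBinaryString(latticeBottom(5))); // prints 11111
    }
}
```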
      • fromBits

        public abstract Label fromBits​(int labelBits)
        For testing purposes, this method allows the client to directly specify the label using the bits of an integer.
      • partition

        public abstract Label partition​(Label label,
                                        int partitionIndex)
        In distributed processing, it is necessary to split labels into partitions. This method returns a partition for the given label. A full label is partitioned into pieces 0 through numPartitions() - 1, so code to process labels in pieces should look like this:
                Label myLabel = myLabeler.label(myInstance);
                for (int i = 0; i < myLabeler.numPartitions(); i++)
                        process(myLabeler.partition(myLabel, i));
         
        Parameters:
        partitionIndex - index of the partition to return
        Returns:
        a new label representing a portion of the attributes represented by the input label.
        Throws:
        java.lang.IllegalArgumentException - if the partitionIndex is greater than numPartitions() or less than zero.
        java.lang.IllegalArgumentException - if the input label is not compatible with this labeler.
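One way to picture partitioning is to treat the label as the low bits of an int and cut it into contiguous index spans. The sketch below works under that assumption. PartitionSketch, its near-equal span widths, and the choice of bit 0 as the least significant bit are all hypothetical and are not taken from the concrete Labeler subclasses.

```java
// Illustrative sketch only; PartitionSketch is a hypothetical class, not part of the library.
final class PartitionSketch {

    /** Splits cardinality bits into the given number of spans; each entry is {start, width}. */
    static int[][] spans(int cardinality, int parts) {
        int[][] result = new int[parts][2];
        int base = cardinality / parts;  // minimum width of every span
        int extra = cardinality % parts; // the first `extra` spans get one more bit
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int width = base + (i < extra ? 1 : 0);
            result[i][0] = start;
            result[i][1] = width;
            start += width;
        }
        return result;
    }

    /** Extracts `width` bits of labelBits starting at bit `start` (bit 0 = least significant). */
    static int partition(int labelBits, int start, int width) {
        int mask = (1 << width) - 1;
        return (labelBits >>> start) & mask;
    }

    public static void main(String[] args) {
        int label = 0b10110; // cardinality 5
        for (int[] span : spans(5, 2)) {
            System.out.println(Integer.toBinaryString(partition(label, span[0], span[1])));
        }
        // prints 110, then 10
    }
}
```

Each piece is itself a shorter label, so it can be processed independently and the results combined afterwards.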