Package weka.classifiers.lazy.AM.label
Class Labeler
- java.lang.Object
-
- weka.classifiers.lazy.AM.label.Labeler
-
- Direct Known Subclasses:
BitSetLabeler, IntLabeler, LongLabeler
public abstract class Labeler extends java.lang.Object
Analogical Modeling uses labels composed of boolean vectors to group instances into subcontexts and subcontexts into supracontexts. A training set instance is assigned a label by comparing it with the instance to be classified and encoding the matched and mismatched attributes in a boolean vector. This class assigns those context labels to training instances.
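The comparison described above can be sketched without Weka. This is a minimal illustration, not the library's implementation: the class LabelSketch, the use of String arrays in place of weka.core.Instance, and the choice of true to mark a mismatch are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical stand-in for a Labeler; not part of the weka.classifiers.lazy.AM API.
final class LabelSketch {
    /**
     * Compares a training item with the test item, attribute by attribute.
     * Assumption: true marks a mismatch and false marks a match.
     */
    static boolean[] label(String[] test, String[] data) {
        if (test.length != data.length) {
            throw new IllegalArgumentException("items must have the same number of attributes");
        }
        boolean[] mismatches = new boolean[test.length];
        for (int i = 0; i < test.length; i++) {
            mismatches[i] = !test[i].equals(data[i]);
        }
        return mismatches;
    }

    public static void main(String[] args) {
        String[] test = {"A", "C", "D", "Z", "R"};
        String[] train = {"A", "C", "X", "Z", "Q"};
        // prints [false, false, true, false, true]
        System.out.println(Arrays.toString(label(test, train)));
    }
}
```

Training items that produce the same vector fall into the same subcontext, which is why the label can serve as a grouping key.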
-
-
Nested Class Summary
protected static class Labeler.Partition
Simple class for storing index spans.
-
Constructor Summary
Labeler(weka.core.Instance test, boolean ignoreUnknowns, MissingDataCompare mdc)
-
Method Summary
abstract Label fromBits(int labelBits)
For testing purposes, this method allows the client to directly specify the label using the bits of an integer.
int getCardinality()
static int getCardinality(weka.core.Instance testInstance, boolean ignoreUnknowns)
Calculate the label cardinality for a given test instance.
java.util.List<java.lang.String> getContextList(Label label, java.lang.String mismatchString)
Returns a list representing the context.
java.lang.String getContextString(Label label)
Returns a string representing the context.
boolean getIgnoreUnknowns()
java.util.List<java.lang.String> getInstanceAttNamesList(weka.core.Instance instance)
java.lang.String getInstanceAttsString(weka.core.Instance instance, java.lang.String attDelimiter)
Returns a string containing the attributes of the input instance (minus the class attribute and ignored attributes).
java.util.List<java.lang.String> getInstanceAttValuesList(weka.core.Instance instance)
Returns a list containing the attributes of the input instance (minus the class attribute and ignored attributes).
abstract Label getLatticeBottom()
Creates and returns the label which belongs at the bottom of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a mismatch.
abstract Label getLatticeTop()
Creates and returns the label which belongs at the top of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a match.
MissingDataCompare getMissingDataCompare()
weka.core.Instance getTestInstance()
boolean isIgnored(int index)
Find if the attribute at the given index is ignored during labeling.
abstract Label label(weka.core.Instance data)
Create a context label for the input instance by comparing it with the test instance.
int numPartitions()
abstract Label partition(Label label, int partitionIndex)
In distributed processing, it is necessary to split labels into partitions.
-
-
-
Constructor Detail
-
Labeler
public Labeler(weka.core.Instance test, boolean ignoreUnknowns, MissingDataCompare mdc)
- Parameters:
test - Instance being classified
ignoreUnknowns - true if attributes with undefined values in the test item should be ignored; false if not.
mdc - Specifies how to compare missing attributes
-
-
Method Detail
-
getCardinality
public int getCardinality()
- Returns:
- The cardinality of the generated labels, or how many instance attributes are considered during labeling.
-
getCardinality
public static int getCardinality(weka.core.Instance testInstance, boolean ignoreUnknowns)
Calculate the label cardinality for a given test instance.
- Parameters:
testInstance - instance to assign labels
ignoreUnknowns - true if unknown values are ignored; false otherwise
- Returns:
- the cardinality of labels generated from testInstance and ignoreUnknowns
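As a rough illustration of what the cardinality counts, the sketch below tallies the attributes that would take part in labeling. It is not the library code: CardinalitySketch is a hypothetical name, the test item is modeled as a String array, an unknown value is modeled as null, and the class attribute is assumed to be excluded (as it is from the context strings documented on this page).

```java
// Hypothetical helper; the real method takes a weka.core.Instance.
final class CardinalitySketch {
    /**
     * Counts the attributes considered during labeling.
     * Assumptions: the class attribute never counts, and a null value
     * stands for an unknown attribute value in the test item.
     */
    static int cardinality(String[] testValues, int classIndex, boolean ignoreUnknowns) {
        int count = 0;
        for (int i = 0; i < testValues.length; i++) {
            if (i == classIndex) {
                continue; // the class attribute is not part of the label
            }
            if (ignoreUnknowns && testValues[i] == null) {
                continue; // unknown test values are skipped on request
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] test = {"A", null, "D", "yes"}; // last entry plays the class attribute
        System.out.println(cardinality(test, 3, true));  // prints 2
        System.out.println(cardinality(test, 3, false)); // prints 3
    }
}
```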
-
getIgnoreUnknowns
public boolean getIgnoreUnknowns()
- Returns:
- true if attributes with undefined values in the test item are ignored during labeling; false if not.
-
getMissingDataCompare
public MissingDataCompare getMissingDataCompare()
- Returns:
- the MissingDataCompare strategy in use by this labeler
-
getTestInstance
public weka.core.Instance getTestInstance()
- Returns:
- the test instance being used to label other instances
-
isIgnored
public boolean isIgnored(int index)
Find if the attribute at the given index is ignored during labeling. The default behavior is to ignore the attributes with unknown values in the test instance if getIgnoreUnknowns() is true.
- Parameters:
index - Index of the attribute being queried
- Returns:
- True if the given attribute is ignored during labeling; false otherwise.
-
label
public abstract Label label(weka.core.Instance data)
Create a context label for the input instance by comparing it with the test instance.
- Parameters:
data - Instance to be labeled
- Returns:
- the label for the context that the instance belongs to. The cardinality of the label will be the same as that of the test and data items. At any given index i, label.matches(i) will return true if that feature is the same in the test and data instances.
- Throws:
java.lang.IllegalArgumentException - if the test and data instances are not from the same data set.
-
getContextString
public java.lang.String getContextString(Label label)
Returns a string representing the context. If the input test instance attributes are "A C D Z R", and the label is 00101, then the return string will be "A C * Z *".
-
getContextList
public java.util.List<java.lang.String> getContextList(Label label, java.lang.String mismatchString)
Returns a list representing the context. If the input test instance attributes are "A C D Z R", the label is 00101, and the mismatchString is "*", then the return list will be "A", "C", "*", "Z", "*".
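The two context examples above can be reproduced with a small stand-alone sketch. It is only an illustration under stated assumptions: ContextSketch is a hypothetical class, the label is passed as a string of '0' and '1' characters read left to right (the real methods take a Label), and '1' is taken to mark a mismatch, as the examples imply.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the real methods operate on a Label object.
final class ContextSketch {
    /** Replaces each mismatched attribute ('1' in labelBits) with mismatchString. */
    static List<String> contextList(String[] testAtts, String labelBits, String mismatchString) {
        if (testAtts.length != labelBits.length()) {
            throw new IllegalArgumentException("label and attributes differ in length");
        }
        List<String> context = new ArrayList<>();
        for (int i = 0; i < testAtts.length; i++) {
            context.add(labelBits.charAt(i) == '1' ? mismatchString : testAtts[i]);
        }
        return context;
    }

    /** Joins the context list with spaces, using "*" for mismatches. */
    static String contextString(String[] testAtts, String labelBits) {
        return String.join(" ", contextList(testAtts, labelBits, "*"));
    }

    public static void main(String[] args) {
        String[] atts = {"A", "C", "D", "Z", "R"};
        System.out.println(contextString(atts, "00101")); // prints A C * Z *
    }
}
```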
-
getInstanceAttsString
public java.lang.String getInstanceAttsString(weka.core.Instance instance, java.lang.String attDelimiter)
Returns a string containing the attributes of the input instance (minus the class attribute and ignored attributes).
-
getInstanceAttValuesList
public java.util.List<java.lang.String> getInstanceAttValuesList(weka.core.Instance instance)
Returns a list containing the attributes of the input instance (minus the class attribute and ignored attributes).
-
getInstanceAttNamesList
public java.util.List<java.lang.String> getInstanceAttNamesList(weka.core.Instance instance)
-
getLatticeTop
public abstract Label getLatticeTop()
Creates and returns the label which belongs at the top of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a match.
- Returns:
- A label with all matches
-
getLatticeBottom
public abstract Label getLatticeBottom()
Creates and returns the label which belongs at the bottom of the boolean lattice formed by the subcontexts labeled by this labeler, i.e. the one for which every feature is a mismatch.
- Returns:
- A label with all mismatches
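For an integer-backed label, the two lattice extremes could look like the sketch below. This is an assumption-laden illustration, not the code of IntLabeler: LatticeSketch is a hypothetical class, and a set bit is taken to mean a mismatch, following the getContextString example where label 00101 yields "A C * Z *".

```java
// Hypothetical illustration of lattice extremes for an int-backed label.
final class LatticeSketch {
    private static void check(int cardinality) {
        if (cardinality < 1 || cardinality > 31) {
            throw new IllegalArgumentException("cardinality must be between 1 and 31");
        }
    }

    /** Top of the lattice: every feature matches, so no bit is set. */
    static int top(int cardinality) {
        check(cardinality);
        return 0;
    }

    /** Bottom of the lattice: every feature mismatches, so all bits are set. */
    static int bottom(int cardinality) {
        check(cardinality);
        return (1 << cardinality) - 1;
    }

    /** Renders the lowest 'cardinality' bits, padded with leading zeros. */
    static String toBinary(int bits, int cardinality) {
        check(cardinality);
        StringBuilder sb = new StringBuilder(Integer.toBinaryString(bits));
        while (sb.length() < cardinality) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(top(5), 5));    // prints 00000
        System.out.println(toBinary(bottom(5), 5)); // prints 11111
    }
}
```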
-
fromBits
public abstract Label fromBits(int labelBits)
For testing purposes, this method allows the client to directly specify the label using the bits of an integer
-
partition
public abstract Label partition(Label label, int partitionIndex)
In distributed processing, it is necessary to split labels into partitions. This method returns a partition for the given label. A full label is partitioned into pieces 0 through numPartitions() - 1, so code to process labels in pieces should look like this:
    Label myLabel = myLabeler.label(myInstance);
    for (int i = 0; i < myLabeler.numPartitions(); i++)
        process(myLabeler.partition(myLabel, i));
- Parameters:
partitionIndex - index of the partition to return
- Returns:
- a new label representing a portion of the attributes represented by the input label.
- Throws:
java.lang.IllegalArgumentException - if the partitionIndex is greater than numPartitions() or less than zero.
java.lang.IllegalArgumentException - if the input label is not compatible with this labeler.
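One way to picture partitioning is to cut an integer label into contiguous bit spans. The sketch below does that under its own assumptions: PartitionSketch is a hypothetical class, spans are made as equal in size as possible, and the lowest bits go to partition 0. How the actual subclasses choose their spans is not stated on this page. The sketch rejects any index outside 0 to numPartitions() - 1.

```java
// Hypothetical illustration of splitting an int-backed label into bit spans.
final class PartitionSketch {
    private final int[] starts; // first bit index of each span
    private final int[] sizes;  // number of bits in each span

    PartitionSketch(int cardinality, int numPartitions) {
        if (cardinality < 1 || cardinality > 31
                || numPartitions < 1 || numPartitions > cardinality) {
            throw new IllegalArgumentException("invalid cardinality or partition count");
        }
        starts = new int[numPartitions];
        sizes = new int[numPartitions];
        int base = cardinality / numPartitions;
        int extra = cardinality % numPartitions;
        int start = 0;
        for (int i = 0; i < numPartitions; i++) {
            sizes[i] = base + (i < extra ? 1 : 0); // earlier spans absorb the remainder
            starts[i] = start;
            start += sizes[i];
        }
    }

    int numPartitions() {
        return starts.length;
    }

    /** Returns the bits of the requested span, shifted down to bit 0. */
    int partition(int labelBits, int partitionIndex) {
        if (partitionIndex < 0 || partitionIndex >= numPartitions()) {
            throw new IllegalArgumentException("no such partition: " + partitionIndex);
        }
        int mask = (1 << sizes[partitionIndex]) - 1;
        return (labelBits >>> starts[partitionIndex]) & mask;
    }

    public static void main(String[] args) {
        PartitionSketch p = new PartitionSketch(5, 2); // spans of 3 and 2 bits
        int label = 0b10110;
        for (int i = 0; i < p.numPartitions(); i++) {
            System.out.println(Integer.toBinaryString(p.partition(label, i)));
        }
        // prints 110, then 10
    }
}
```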
-
numPartitions
public int numPartitions()
- Returns:
- The number of label partitions available via
partition(weka.classifiers.lazy.AM.label.Label, int)
-
-