A set of classes for creating Information Retrieval programs.


Interface Summary
Document An interface for a document in the Information Retrieval Library.
DocumentFilter Applies a transformation to the indexable content of a Document.
DocumentSet An interface to a randomly accessible collection of documents.
EditableDocumentSet A simple extension of DocumentSet which lets you add documents.
InformationSource A generic interface for queryable information sources.
IRPacket An interface for information needed in Information Retrieval Tests.
SearchEngine An interface for a search engine.
VectorCreator An interface for classes that can create vectors for documents.

Class Summary
AbstractDocument Implements some of the basic Document methods using a TermVector.
AbstractIRPacket Provides some implementation independent utility methods for IRPackets.
AbstractVectorCreator Implements some of the functionality of the vector creator and provides a convenience method for making produced vectors conform to the user's limitations on them.
ASCIIDocument Represents a simple ASCII document with no highlighted text.
DefaultDocumentSet Simple document set.
DefaultIRPacket A description of an information retrieval task.
DropLittleWords Drops any words of length maximumSize or below (default is three).
FileSearchEngine This is an implementation of the SearchEngine that uses the FileBTree.
FrequencyVectorCreator Sets keyword scores based on the frequency of the words.
GeneralVectorCreator Generates vectors using various standard indexing schemes.
HTMLDocument A document which can identify HTML Tags and pull out HTML info.
InvertedIndex Computes and stores an inverted index, which can be used by a search engine.
JDBCVectorCreator Generates TFIDF vectors from a document set.
OrderVectorCreator Sets keyword scores based on the order of word occurrence.
PagedDocumentSet A document set backed by a FileObjectPager.
PhraseFrequencyVectorCreator Sets phrase scores based on the frequency of bigrams.
PorterStemmer This class implements the Porter stemming algorithm.
PrecisionRecall A class that performs Precision/Recall testing.
PRPair The results of a single query for a specific number of documents.
PRResult The results of a PrecisionRecall experiment on a particular query.
PRResultUtils Utilities for working with PRResult objects.
RAMSearchEngine An implementation of the SearchEngine interface that keeps all intermediate information in memory.
StopList Implementation of a simple stoplist.
StreamFreqTable This class is intended to provide incremental TFIDF information when there is no sense of a "document".
TermVector A term vector is a mapping of all words in a language to values.
TermVectorTableModel A TableModel for displaying term vectors in a JTable.
TFIDFVectorCreator Generates TFIDF vectors from a document set.
VectorCreatorLibrary A convenience class for retrieving provided vector creators.
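
Several of the classes above (DropLittleWords, StopList, FrequencyVectorCreator, TFIDFVectorCreator) describe steps of a term-weighting pipeline. The sketch below illustrates that general idea in plain Java. It does not use this package's API: the class name TfIdfSketch, its method names, and the weighting formula tf * ln(N / df) are assumptions chosen for illustration, and the library's own classes may tokenize and weight terms differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of this package's API.
final class TfIdfSketch {

    // Lower-cases the text, splits on runs of non-letters, and drops
    // stop words and words of maxDroppedLength characters or fewer.
    static List<String> tokenize(String text, int maxDroppedLength, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty()
                    && word.length() > maxDroppedLength
                    && !stopWords.contains(word)) {
                terms.add(word);
            }
        }
        return terms;
    }

    // Counts how often each term occurs in one tokenized document.
    static Map<String, Integer> termFrequencies(List<String> terms) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : terms) {
            tf.merge(term, 1, Integer::sum);
        }
        return tf;
    }

    // Builds one weight map per document using tf * ln(N / df), where N is
    // the number of documents and df the number of documents containing the term.
    static List<Map<String, Double>> tfidf(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Double> vector = new HashMap<>();
            for (Map.Entry<String, Integer> e : termFrequencies(doc).entrySet()) {
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                vector.put(e.getKey(), e.getValue() * idf);
            }
            vectors.add(vector);
        }
        return vectors;
    }
}
```

Under this formula, a term that appears in every document receives a weight of zero, so terms shared by the whole collection do not influence ranking. Passing 3 as maxDroppedLength mirrors the default described for DropLittleWords.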

Exception Summary
SearchEngineException An exception thrown by some SearchEngine.

Package Description

A set of classes for creating Information Retrieval programs.
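
The evaluation classes listed in the Class Summary (PrecisionRecall, PRPair, PRResult) deal with precision and recall measured over a fixed number of returned documents. As a rough illustration of that measure, the sketch below computes both values for the top k entries of a ranked result list. The class PrecisionRecallSketch and the method atCutoff are hypothetical names, not part of this package, and the sketch assumes that the ranked list contains no duplicate document identifiers.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of this package's API.
final class PrecisionRecallSketch {

    // Returns {precision, recall} for the first k entries of a ranked list.
    // Precision is divided by the number of documents actually examined,
    // which is smaller than k when the result list is shorter than k.
    static double[] atCutoff(List<String> ranked, Set<String> relevant, int k) {
        int cutoff = Math.max(0, Math.min(k, ranked.size()));
        int hits = 0;
        for (int i = 0; i < cutoff; i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
            }
        }
        double precision = cutoff == 0 ? 0.0 : (double) hits / cutoff;
        double recall = relevant.isEmpty() ? 0.0 : (double) hits / relevant.size();
        return new double[] {precision, recall};
    }
}
```

Calling atCutoff for increasing values of k yields the sequence of precision and recall pairs that is typically plotted as a precision/recall curve. Dividing by the number of documents examined, rather than by k itself, is one of several conventions; an evaluation that must penalize short result lists would divide by k instead.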