iglu.ir
Interface SearchEngine

All Superinterfaces:
InformationSource
All Known Implementing Classes:
FileSearchEngine, RAMSearchEngine

public interface SearchEngine
extends InformationSource

An interface for a search engine. Implementations store and retrieve document vectors and document data by document identifier. The content of the document identifier is up to the specific implementation. The document data may contain the document content in addition to any other identifying information that needs to be stored. Each implementation of this interface is responsible for its own internal representation. Each class will probably have its own initialization requirements and should provide appropriate methods for them.
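
The store-and-retrieve contract described above can be sketched roughly as follows. This is only an illustration, not the code of FileSearchEngine or RAMSearchEngine: the class name DocStoreSketch and its put method are hypothetical, and a Map<String, Double> stands in for TermVector, whose real API is not shown on this page.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not part of iglu.ir: a document store keyed by a
// Serializable identifier. Map<String, Double> stands in for TermVector.
class DocStoreSketch {
    private final Map<Serializable, Map<String, Double>> vectors = new HashMap<>();
    private final Map<Serializable, Serializable> data = new HashMap<>();

    // Stores the vector and the document data under the same identifier.
    void put(Serializable docId, Serializable docData, Map<String, Double> docVector) {
        vectors.put(docId, docVector);
        data.put(docId, docData);
    }

    // True if a document with that identifier has been stored.
    boolean docExists(Serializable docId) {
        return vectors.containsKey(docId);
    }

    // Mirrors the documented getVector behaviour: null for an unknown identifier.
    Map<String, Double> getVector(Serializable docId) {
        return vectors.get(docId);
    }

    // Returns the data stored for the identifier, or null if there is none.
    Serializable getDocData(Serializable docId) {
        return data.get(docId);
    }
}
```

Keeping the vector and the data in separate maps is one possible internal representation; the interface leaves that choice to each implementation.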

Author:
Ryan Scherle, Travis Bauer

Method Summary
 void addDocument(java.io.Serializable docId, java.io.Serializable docData, TermVector docVector)
          Add a vector to the collection.
 boolean docExists(java.io.Serializable docId)
          Returns true if a document with that ID is already in the database.
 boolean equals(java.lang.Object o)
          Indicates whether an object is equal to this SearchEngine.
 java.lang.String getDescription()
          Returns a textual description of this information source.
 java.io.Serializable getDocData(java.io.Serializable docId)
          Returns the document data associated with docId.
 java.lang.String getMetricName()
          Returns the name of the similarity metric used by this class.
 java.lang.String getName()
          Returns the name of this particular source.
 double getSimilarityScore(TermVector vector1, TermVector vector2)
          Returns the similarity of the two vectors based on the metric indicated by getMetricName().
 TermVector getVector(java.io.Serializable docId)
          Get the vector for the given document.
 java.util.Iterator iterator()
          Returns an iterator over the document identifiers.
 ValueSortedMap retrieveDocuments(TermVector vector, int numSimilar)
          Returns the identifiers of documents similar to the given vector, sorted by similarity.
 void setDescription(java.lang.String description)
          Sets the description of this particular search engine.
 void setDocData(java.io.Serializable docId, java.io.Serializable docData)
          Sets the document's data.
 void setName(java.lang.String name)
          Sets the name of this particular source.
 void setVector(java.io.Serializable docId, TermVector docVector)
          Change the vector for docId to the given vector.
 

Method Detail

getDescription

public java.lang.String getDescription()
Returns a textual description of this information source.

Specified by:
getDescription in interface InformationSource

setDescription

public void setDescription(java.lang.String description)
Sets the description of this particular search engine.


getName

public java.lang.String getName()
Returns the name of this particular source.

Specified by:
getName in interface InformationSource

setName

public void setName(java.lang.String name)
Sets the name of this particular source.


getMetricName

public java.lang.String getMetricName()
Returns the name of the similarity metric used by this class.


getSimilarityScore

public double getSimilarityScore(TermVector vector1,
                                 TermVector vector2)
Returns the similarity of the two vectors based on the metric indicated by getMetricName().
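
This page does not say which metric a given implementation uses; getMetricName() reports it at run time. As one common possibility, cosine similarity over sparse term weights could look like the sketch below. The class CosineSketch is hypothetical, and Map<String, Double> again stands in for TermVector.

```java
import java.util.Map;

// Hypothetical example of one possible metric; not taken from iglu.ir.
class CosineSketch {
    // Cosine similarity between two sparse term-weight maps.
    // Returns 0.0 when either vector has zero length.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other; // only shared terms contribute
            }
        }
        double norm = Math.sqrt(sumSquares(a)) * Math.sqrt(sumSquares(b));
        return norm == 0.0 ? 0.0 : dot / norm;
    }

    // Sum of squared weights, i.e. the squared Euclidean length.
    private static double sumSquares(Map<String, Double> v) {
        double s = 0.0;
        for (double w : v.values()) {
            s += w * w;
        }
        return s;
    }
}
```

With non-negative weights the result falls between 0 (no shared terms) and 1 (same direction). Other metrics may use a different range, so scores from engines that report different metric names should not be compared directly.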


equals

public boolean equals(java.lang.Object o)
Indicates whether an object is equal to this SearchEngine.

Overrides:
equals in class java.lang.Object

addDocument

public void addDocument(java.io.Serializable docId,
                        java.io.Serializable docData,
                        TermVector docVector)
                 throws SearchEngineException
Add a vector to the collection. Throws an exception if an error occurs while adding the document, including when docId is already in the engine.

Throws:
SearchEngineException
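
Because addDocument rejects an identifier that is already present, a caller that may see the same document twice has to choose between adding and updating. The sketch below illustrates that pattern with hypothetical stand-ins: AddOnceSketch, DuplicateDocException (in place of SearchEngineException, whose constructors are not shown here), and the addOrUpdate helper are all assumptions. Document data is simplified to a String and the vector argument is omitted.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical checked exception standing in for SearchEngineException.
class DuplicateDocException extends Exception {
    DuplicateDocException(String message) {
        super(message);
    }
}

// Hypothetical stand-in engine; only the duplicate-id behaviour is modelled.
class AddOnceSketch {
    private final Map<Serializable, String> docs = new HashMap<>();

    // Rejects an identifier that is already stored, as addDocument is documented to do.
    void addDocument(Serializable docId, String docData) throws DuplicateDocException {
        if (docs.containsKey(docId)) {
            throw new DuplicateDocException("Document already stored: " + docId);
        }
        docs.put(docId, docData);
    }

    boolean docExists(Serializable docId) {
        return docs.containsKey(docId);
    }

    // Replaces the data stored under the identifier.
    void setDocData(Serializable docId, String docData) {
        docs.put(docId, docData);
    }

    String getDocData(Serializable docId) {
        return docs.get(docId);
    }

    // Caller-side pattern: add when the id is new, otherwise update in place.
    void addOrUpdate(Serializable docId, String docData) throws DuplicateDocException {
        if (docExists(docId)) {
            setDocData(docId, docData);
        } else {
            addDocument(docId, docData);
        }
    }
}
```

Checking docExists first keeps the exception for genuine failures; catching the exception and then calling the setters would be an alternative.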

setVector

public void setVector(java.io.Serializable docId,
                      TermVector docVector)
Change the vector for docId to the given vector.


getVector

public TermVector getVector(java.io.Serializable docId)
Get the vector for the given document. Returns null if the document ID is not in the collection.


getDocData

public java.io.Serializable getDocData(java.io.Serializable docId)
Returns the document data associated with docId.


setDocData

public void setDocData(java.io.Serializable docId,
                       java.io.Serializable docData)
Sets the document's data.


retrieveDocuments

public ValueSortedMap retrieveDocuments(TermVector vector,
                                        int numSimilar)
Returns the identifiers of documents similar to the given vector, sorted by similarity.

Specified by:
retrieveDocuments in interface InformationSource
Parameters:
numSimilar - The maximum number of documents to return. If 0, return all documents.
Returns:
A list of document identifiers, ordered by similarity.
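
The numSimilar rule (a maximum count, with 0 meaning every document) can be sketched over precomputed scores as shown below. TopNSketch is hypothetical, a LinkedHashMap stands in for ValueSortedMap, and two details are assumptions rather than documented behaviour: the most similar document comes first, and a negative numSimilar is rejected.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical ranking helper; not the iglu.ir implementation.
class TopNSketch {
    // Returns up to numSimilar ids with their scores, highest score first.
    // numSimilar == 0 means "return all", following the documented parameter.
    static Map<String, Double> topN(Map<String, Double> scores, int numSimilar) {
        if (numSimilar < 0) {
            // Assumption: the source does not say how negative values are handled.
            throw new IllegalArgumentException("numSimilar must be >= 0");
        }
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        // Sort by descending score (assumed ordering).
        entries.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        int limit = (numSimilar == 0)
                ? entries.size()
                : Math.min(numSimilar, entries.size());
        Map<String, Double> result = new LinkedHashMap<>(); // keeps insertion order
        for (Map.Entry<String, Double> e : entries.subList(0, limit)) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```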

docExists

public boolean docExists(java.io.Serializable docId)
Returns true if a document with that ID is already in the database.


iterator

public java.util.Iterator iterator()
Returns an iterator over the document identifiers. Iterates in a random order.
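
Since the iteration order is not fixed, code that needs repeatable output should not depend on it. A minimal sketch, assuming the identifiers are Strings (the interface only requires Serializable) and using the hypothetical helper IdIterationSketch, is to collect the identifiers and sort them before processing:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper; assumes String identifiers for illustration.
class IdIterationSketch {
    // Drains an iterator of ids into a sorted list so that later
    // processing does not depend on the engine's iteration order.
    static List<String> sortedIds(Iterator<String> ids) {
        List<String> out = new ArrayList<>();
        while (ids.hasNext()) {
            out.add(ids.next());
        }
        Collections.sort(out);
        return out;
    }
}
```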