iglu.ir
Class RAMSearchEngine

java.lang.Object
  |
  +--iglu.ir.RAMSearchEngine
All Implemented Interfaces:
InformationSource, SearchEngine, java.io.Serializable

public class RAMSearchEngine
extends java.lang.Object
implements SearchEngine, java.io.Serializable

An implementation of the SearchEngine interface that keeps all intermediate information in memory. This class is best suited to search-engine-like structures that hold a small number of documents (under 1000). The data stored by this search engine is not inherently persistent: to keep it between program runs, you must serialize this object.
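
The persistence note above can be illustrated with standard Java serialization. This is a minimal sketch, not the library's own code: `InMemoryEngine` is a hypothetical stand-in for RAMSearchEngine (which is not reproduced here), and the helper names `save` and `load` are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

class PersistenceSketch {
    /** Hypothetical stand-in for RAMSearchEngine: a Serializable holder of in-memory maps. */
    static class InMemoryEngine implements Serializable {
        private static final long serialVersionUID = 1L;
        final HashMap<Serializable, Serializable> idDataMap = new HashMap<>();
    }

    /** Serializes the engine to a byte array (a file stream would work the same way). */
    static byte[] save(Serializable engine) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(engine);
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize engine", e);
        }
        return bytes.toByteArray();
    }

    /** Restores an engine from bytes produced by save(). */
    static Object load(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize engine", e);
        }
    }

    public static void main(String[] args) {
        InMemoryEngine engine = new InMemoryEngine();
        engine.idDataMap.put("doc-1", "first document");
        InMemoryEngine restored = (InMemoryEngine) load(save(engine));
        System.out.println(restored.idDataMap.get("doc-1"));
    }
}
```

With the real class, the same pattern would presumably apply by writing the RAMSearchEngine instance to an ObjectOutputStream backed by a file, since the class implements java.io.Serializable.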

Author:
Ryan Scherle
See Also:
Serialized Form

Field Summary
private  java.lang.String description
           
private  java.util.HashMap idDataMap
           
private  java.util.HashMap idVectorMap
           
private  java.lang.String metricName
           
private  java.lang.String name
           
 
Constructor Summary
RAMSearchEngine()
           
 
Method Summary
 void addDocument(java.io.Serializable docId, java.io.Serializable docData, TermVector docVector)
          Adds a document, with its data and vector, to the collection.
 boolean docExists(java.io.Serializable docId)
          Returns true if a document with that ID is already in the engine.
 boolean equals(java.lang.Object o)
          Indicates whether an object is equal to this SearchEngine.
 java.lang.String getDescription()
          Returns a textual description of this information source.
 java.io.Serializable getDocData(java.io.Serializable docId)
          Returns the document data associated with docId.
 java.lang.String getMetricName()
          Returns the name of the similarity metric used by this class.
 java.lang.String getName()
          Returns the name of this particular source.
 double getSimilarityScore(TermVector vector1, TermVector vector2)
          Returns the similarity of the two vectors based on the metric indicated by getMetricName().
 TermVector getVector(java.io.Serializable docId)
          Get the vector for the given document.
 int hashCode()
          Provides a hash code for this SearchEngine.
 java.util.Iterator iterator()
          Returns an iterator over the document identifiers.
static void main(java.lang.String[] args)
          Runs some tests on this class.
 ValueSortedMap retrieveDocuments(TermVector vector, int numSimilar)
          Returns the identifiers of documents similar to the given vector, sorted by similarity.
 void setDescription(java.lang.String description)
          Sets the description of this particular search engine.
 void setDocData(java.io.Serializable docId, java.io.Serializable docData)
          Sets the document's data.
 void setName(java.lang.String name)
          Sets the name of this particular source.
 void setVector(java.io.Serializable docId, TermVector docVector)
          Change the vector for docId to the given vector.
 int size()
          Returns the number of documents in this engine.
static void test()
          Runs some tests on this class.
 java.lang.String toString()
          Returns a string indicating the type and size of this engine.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

description

private java.lang.String description

name

private java.lang.String name

metricName

private java.lang.String metricName

idDataMap

private java.util.HashMap idDataMap

idVectorMap

private java.util.HashMap idVectorMap
Constructor Detail

RAMSearchEngine

public RAMSearchEngine()
Method Detail

getDescription

public java.lang.String getDescription()
Returns a textual description of this information source.

Specified by:
getDescription in interface SearchEngine

setDescription

public void setDescription(java.lang.String description)
Sets the description of this particular search engine.

Specified by:
setDescription in interface SearchEngine

getName

public java.lang.String getName()
Returns the name of this particular source.

Specified by:
getName in interface SearchEngine

setName

public void setName(java.lang.String name)
Sets the name of this particular source.

Specified by:
setName in interface SearchEngine

getMetricName

public java.lang.String getMetricName()
Returns the name of the similarity metric used by this class.

Specified by:
getMetricName in interface SearchEngine

getSimilarityScore

public double getSimilarityScore(TermVector vector1,
                                 TermVector vector2)
Returns the similarity of the two vectors based on the metric indicated by getMetricName(). For best efficiency, vector1 should be the shorter of the two vectors.

Specified by:
getSimilarityScore in interface SearchEngine
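
The advice about passing the shorter vector first makes sense for sparse vectors, where the score is computed by walking one vector and looking up each term in the other. The sketch below assumes a dot-product-style metric and represents each vector as a `Map<String, Double>`; both are illustrative assumptions, since the actual metric is whatever getMetricName() reports and the TermVector API is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

class SimilaritySketch {
    /**
     * Sparse dot product: iterates over the first map and probes the second.
     * The loop runs once per entry of `shorter`, so passing the smaller
     * vector first minimizes the number of hash lookups.
     */
    static double dot(Map<String, Double> shorter, Map<String, Double> longer) {
        double sum = 0.0;
        for (Map.Entry<String, Double> entry : shorter.entrySet()) {
            Double weight = longer.get(entry.getKey());
            if (weight != null) {
                sum += entry.getValue() * weight;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> query = new HashMap<>();
        query.put("search", 1.0);
        query.put("engine", 2.0);

        Map<String, Double> document = new HashMap<>();
        document.put("engine", 3.0);
        document.put("memory", 4.0);
        document.put("vector", 5.0);

        // Only the shared term "engine" contributes: 2.0 * 3.0.
        System.out.println(dot(query, document));
    }
}
```

The result is the same in either argument order; only the amount of work differs.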

equals

public boolean equals(java.lang.Object o)
Indicates whether an object is equal to this SearchEngine.

Specified by:
equals in interface SearchEngine
Overrides:
equals in class java.lang.Object

hashCode

public int hashCode()
Provides a hash code for this SearchEngine.

Overrides:
hashCode in class java.lang.Object

size

public int size()
Returns the number of documents in this engine.


toString

public java.lang.String toString()
Returns a string indicating the type and size of this engine.

Overrides:
toString in class java.lang.Object

addDocument

public void addDocument(java.io.Serializable docId,
                        java.io.Serializable docData,
                        TermVector docVector)
                 throws SearchEngineException
Adds a document, with its data and vector, to the collection. Throws a SearchEngineException if an error occurs while adding the document, including when docId is already in the engine.

Specified by:
addDocument in interface SearchEngine
Throws:
SearchEngineException
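
Because adding an existing docId is reported as an error, callers may want to check docExists first. The following is a minimal sketch of that contract under stated assumptions: `EngineException` is a hypothetical stand-in for SearchEngineException, document IDs and data are simplified to strings, and `addIfAbsent` is an illustrative helper that does not exist in the documented class.

```java
import java.util.HashMap;
import java.util.Map;

class AddDocumentSketch {
    /** Hypothetical stand-in for SearchEngineException. */
    static class EngineException extends Exception {
        EngineException(String message) {
            super(message);
        }
    }

    private final Map<String, String> idDataMap = new HashMap<>();

    /** Rejects duplicate IDs, mirroring the documented addDocument behavior. */
    void addDocument(String docId, String docData) throws EngineException {
        if (idDataMap.containsKey(docId)) {
            throw new EngineException("Document already exists: " + docId);
        }
        idDataMap.put(docId, docData);
    }

    boolean docExists(String docId) {
        return idDataMap.containsKey(docId);
    }

    /** Illustrative helper: adds only when the ID is new and reports whether it did. */
    boolean addIfAbsent(String docId, String docData) {
        if (docExists(docId)) {
            return false;
        }
        try {
            addDocument(docId, docData);
            return true;
        } catch (EngineException e) {
            return false;
        }
    }
}
```

To replace an existing document rather than reject it, the documented setVector and setDocData methods look like the intended route.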

setVector

public void setVector(java.io.Serializable docId,
                      TermVector docVector)
Change the vector for docId to the given vector.

Specified by:
setVector in interface SearchEngine

getVector

public TermVector getVector(java.io.Serializable docId)
Gets the vector for the given document. Returns null if the document ID is not in the collection.

Specified by:
getVector in interface SearchEngine
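
Since a missing document yields null rather than an exception, callers need a null check before using the result. A minimal sketch, assuming vectors are simplified to `double[]` and IDs to strings (the real method takes a java.io.Serializable and returns a TermVector); `vectorLength` is a hypothetical caller-side helper.

```java
import java.util.HashMap;
import java.util.Map;

class VectorLookupSketch {
    private final Map<String, double[]> idVectorMap = new HashMap<>();

    void setVector(String docId, double[] docVector) {
        idVectorMap.put(docId, docVector);
    }

    /** Returns the stored vector, or null when docId is unknown. */
    double[] getVector(String docId) {
        return idVectorMap.get(docId);
    }

    /** Caller-side guard: the vector's length, or -1 when the document is missing. */
    int vectorLength(String docId) {
        double[] vector = getVector(docId);
        return (vector == null) ? -1 : vector.length;
    }
}
```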

getDocData

public java.io.Serializable getDocData(java.io.Serializable docId)
Returns the document data associated with docId.

Specified by:
getDocData in interface SearchEngine
Parameters:
docId - an identifier of the document.

setDocData

public void setDocData(java.io.Serializable docId,
                       java.io.Serializable docData)
Sets the document's data.

Specified by:
setDocData in interface SearchEngine

retrieveDocuments

public ValueSortedMap retrieveDocuments(TermVector vector,
                                        int numSimilar)
Returns the identifiers of documents similar to the given vector, sorted by similarity.

Specified by:
retrieveDocuments in interface SearchEngine
Parameters:
numSimilar - The maximum number of documents to return. If 0, return all documents.
Returns:
The identifiers of the matching documents, ordered by similarity.
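
The numSimilar contract (a cap on the result size, with 0 meaning "all documents") can be sketched as a sort followed by a truncation. This is an illustration under assumptions, not the class's implementation: scores are taken as precomputed, a `LinkedHashMap` stands in for ValueSortedMap, the ordering is assumed to be most similar first, and rejecting negative values is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RetrievalSketch {
    /**
     * Returns up to numSimilar entries ordered from highest to lowest score.
     * A numSimilar of 0 returns every entry.
     */
    static LinkedHashMap<String, Double> topSimilar(Map<String, Double> scores, int numSimilar) {
        if (numSimilar < 0) {
            // Assumption for this sketch: negative limits are treated as caller errors.
            throw new IllegalArgumentException("numSimilar must not be negative");
        }
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        int limit = (numSimilar == 0) ? entries.size() : Math.min(numSimilar, entries.size());

        // LinkedHashMap preserves insertion order, so iteration follows the ranking.
        LinkedHashMap<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : entries.subList(0, limit)) {
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

Sorting every score is reasonable at the collection sizes this class targets (under 1000 documents); a bounded priority queue would be the usual alternative for larger collections.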

docExists

public boolean docExists(java.io.Serializable docId)
Returns true if a document with that ID is already in the engine.

Specified by:
docExists in interface SearchEngine

iterator

public java.util.Iterator iterator()
Returns an iterator over the document identifiers. The iteration order is unspecified.

Specified by:
iterator in interface SearchEngine

test

public static void test()
Runs some tests on this class. No output means everything is working correctly.


main

public static void main(java.lang.String[] args)
Runs some tests on this class. No output means everything is working correctly.