iglu.ir
Class PrecisionRecall

java.lang.Object
  |
  +--iglu.ir.PrecisionRecall

public class PrecisionRecall
extends java.lang.Object

A class that performs precision/recall testing. Results are consistent with _Modern_Information_Retrieval_ by Baeza-Yates and Ribeiro-Neto, chapter 3. You pass it a DocumentSet, a set of queries, a mapping from each query to its relevant documents, and a VectorCreator for indexing. The class then indexes the documents and runs the queries to determine the precision/recall characteristics achieved with the given vector creator.

This class creates vectors for the documents, puts them into the SearchEngine, runs the queries, and produces precision/recall information. It is designed to compare vector creators, but it can also be used to experiment with the results of changing any of the other inputs.
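As a point of reference for what "precision/recall information" means here, the following is a minimal, self-contained sketch of the standard computation for one ranked result list: precision and recall after each rank, plus interpolated precision at the 11 standard recall levels described in the cited chapter. It does not use the iglu.ir classes; the class name PrecisionRecallSketch and its methods are hypothetical, and whether PRResult stores exactly these values is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not part of iglu.ir.
public class PrecisionRecallSketch {

    /**
     * Returns one row per rank: row[0] is precision and row[1] is recall
     * after the first (i + 1) retrieved documents.
     */
    static double[][] precisionRecallCurve(int[] ranked, int[] relevant) {
        Set<Integer> rel = new HashSet<>();
        for (int d : relevant) {
            rel.add(d);
        }
        double[][] curve = new double[ranked.length][2];
        int hits = 0;
        for (int i = 0; i < ranked.length; i++) {
            if (rel.contains(ranked[i])) {
                hits++;
            }
            curve[i][0] = (double) hits / (i + 1);
            curve[i][1] = rel.isEmpty() ? 0.0 : (double) hits / rel.size();
        }
        return curve;
    }

    /**
     * Interpolated precision at recall levels 0.0, 0.1, ..., 1.0:
     * the highest precision observed at any recall >= the level,
     * or 0.0 if that recall level is never reached.
     */
    static double[] elevenPoint(double[][] curve) {
        double[] out = new double[11];
        for (int j = 0; j <= 10; j++) {
            double level = j / 10.0;
            double best = 0.0;
            for (double[] pr : curve) {
                if (pr[1] >= level - 1e-12 && pr[0] > best) {
                    best = pr[0];
                }
            }
            out[j] = best;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] ranked = {3, 7, 1, 9, 4};   // document numbers in ranked order
        int[] relevant = {3, 1, 4, 8};    // documents judged relevant
        double[] points = elevenPoint(precisionRecallCurve(ranked, relevant));
        for (int j = 0; j <= 10; j++) {
            System.out.println("recall " + (j / 10.0) + " -> precision " + points[j]);
        }
    }
}
```

In this example three of the four relevant documents are retrieved, so recall never exceeds 0.75 and the interpolated precision at recall levels 0.8 through 1.0 is 0.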

By default it uses Ryan Scherle's RAMSearchEngine. However, this is only suitable for small document sets. For larger document sets, you should create your own search engine and pass it in.

Version:
1.0
Author:
Travis Bauer

Field Summary
(package private)  IRPacket irPacket
           
(package private)  boolean noisy
           
(package private)  SearchEngine searchEngine
           
(package private)  VectorCreator vc
           
 
Constructor Summary
PrecisionRecall(IRPacket irPacket, VectorCreator vc, boolean index)
          Create a new PrecisionRecall object, using the RAMSearchEngine.
PrecisionRecall(SearchEngine searchEngine, IRPacket irPacket)
           
PrecisionRecall(SearchEngine searchEngine, IRPacket irPacket, VectorCreator vc, boolean index)
           
 
Method Summary
protected  void loadSearchEngine()
          Index the documents, inserting them into the database.
static void main(java.lang.String[] argv)
          A simple test of the PrecisionRecall class
 int[] relatedDocs(int docNum)
          Find all the documents related to the given document.
 void setNoisy(boolean n)
          Whether or not this class should print out its progress while running.
 PRResult[] testDocs(int[] docNums)
          Create vectors for the indicated documents in the document set and run those vectors as queries, returning results.
 PRResult[] testQueries()
          Test all the queries named in the constructor.
 PRResult testQuery(int queryID)
          Test a given query.
protected  PRResult testVector(TermVector v, int[] goodDocs, java.io.Serializable qid)
          Test a vector, assuming that goodDocs contains references to the relevant documents in the document set.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

irPacket

IRPacket irPacket

searchEngine

SearchEngine searchEngine

vc

VectorCreator vc

noisy

boolean noisy
Constructor Detail

PrecisionRecall

public PrecisionRecall(IRPacket irPacket,
                       VectorCreator vc,
                       boolean index)
Create a new PrecisionRecall object, using the RAMSearchEngine.

Parameters:
vc - A vector creator for indexing documents.
index - Whether or not to index the documents and add them to the database. This should normally be true, except when you have a persistent search engine, have already indexed the data, and just want to run the queries.

PrecisionRecall

public PrecisionRecall(SearchEngine searchEngine,
                       IRPacket irPacket,
                       VectorCreator vc,
                       boolean index)

PrecisionRecall

public PrecisionRecall(SearchEngine searchEngine,
                       IRPacket irPacket)
Method Detail

loadSearchEngine

protected void loadSearchEngine()
Index the documents, inserting them into the database.


testQuery

public PRResult testQuery(int queryID)
Test a given query. queryID is an index into queries.


testVector

protected PRResult testVector(TermVector v,
                              int[] goodDocs,
                              java.io.Serializable qid)
Test a vector, assuming that goodDocs contains references to the relevant documents in the document set. The query is named qid.


testQueries

public PRResult[] testQueries()
Test all the queries named in the constructor.


testDocs

public PRResult[] testDocs(int[] docNums)
Create vectors for the indicated documents in the document set and run those vectors as queries, returning results. For each such document, the relevant documents to retrieve are assumed to be those that are relevant to at least one query to which the given document is also relevant.


relatedDocs

public int[] relatedDocs(int docNum)
Find all the documents related to the given document. A related document is one that maps to a query that the document indicated by docNum also matches.
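The definition of relatedness above can be sketched in a few lines. This is only an illustration under stated assumptions: the query-to-documents mapping is represented as a plain Map from query ID to an array of relevant document numbers, the class name RelatedDocsSketch is hypothetical, and the actual implementation in iglu.ir may differ, for instance in whether docNum itself appears in the result (it does here) or in the order of the returned array (sorted here).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration; not part of iglu.ir.
public class RelatedDocsSketch {

    /**
     * Collects every document that shares at least one query with docNum.
     * relevantByQuery maps a query ID to the documents relevant to it.
     */
    static int[] relatedDocs(Map<Integer, int[]> relevantByQuery, int docNum) {
        TreeSet<Integer> related = new TreeSet<>();
        for (int[] docs : relevantByQuery.values()) {
            boolean matches = false;
            for (int d : docs) {
                if (d == docNum) {
                    matches = true;
                    break;
                }
            }
            if (matches) {
                // docNum is relevant to this query, so all of the
                // query's relevant documents count as related.
                for (int d : docs) {
                    related.add(d);
                }
            }
        }
        return related.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        Map<Integer, int[]> relevantByQuery = new LinkedHashMap<>();
        relevantByQuery.put(0, new int[]{1, 2, 3});
        relevantByQuery.put(1, new int[]{3, 4});
        relevantByQuery.put(2, new int[]{5, 6});
        // Document 3 is relevant to queries 0 and 1, so it is related
        // to documents 1, 2, 3, and 4.
        System.out.println(java.util.Arrays.toString(relatedDocs(relevantByQuery, 3)));
    }
}
```

Under the same assumptions, testDocs would use a set like this as the relevant documents when a document's own vector is run as a query.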


setNoisy

public void setNoisy(boolean n)
Whether or not this class should print out its progress while running.


main

public static void main(java.lang.String[] argv)
A simple test of the PrecisionRecall class