java.lang.Object
  |
  +--iglu.examples.DirectorySearchEngine
An example of using the iglu.ir package. This class creates a simple
search engine from the files in a directory structure, assuming that
the files are all text documents. It is rather inefficient, since it
must rebuild the index every time it is used, but this simplicity
makes it an easy example.
Field Summary
private RAMSearchEngine | engine
private ValueSortedMap | searchResults
private StopList | stopList
private TFIDFVectorCreator | vecMaker
Constructor Summary
DirectorySearchEngine()
Initializes the vector creator (which collects keywords from the files and assigns weights to them) and the search engine (which allows the keywords to be stored and retrieved).
Method Summary
void | doSearch(java.io.File directory, TermVector query)
Indexes all files in the directory and sends the query to the engine.
private void | indexFile(java.io.File file, boolean indexing)
Adds a single file to the search engine.
static void | main(java.lang.String[] args)
Runs the search engine.
private void | printResults(TermVector query, ValueSortedMap results)
Prints out the list of files that match the query.
private void | processFiles(java.io.File file, boolean indexing)
Goes through each file in the directory, passing the files to indexFile.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
private RAMSearchEngine engine
private TFIDFVectorCreator vecMaker
private ValueSortedMap searchResults
private StopList stopList
Constructor Detail
public DirectorySearchEngine()
Method Detail
public void doSearch(java.io.File directory, TermVector query)
Indexes all files in the directory and sends the query to the engine. The results are printed to stdout.
private void processFiles(java.io.File file, boolean indexing)
Goes through each file in the directory, passing the files to indexFile.
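The source of processFiles is not shown on this page, so the following is only a minimal sketch of how such a directory traversal might look. The class DirectoryWalk and the method collectFiles are hypothetical names, not part of iglu; the sketch gathers the files into a list instead of passing them to indexFile.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a recursive directory walk; not the iglu implementation.
class DirectoryWalk {

    // Recursively adds every regular file found under 'file' to 'out'.
    static void collectFiles(File file, List<File> out) {
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children == null) {
                return; // listFiles() returns null on an I/O error or missing permission
            }
            for (File child : children) {
                collectFiles(child, out);
            }
        } else if (file.isFile()) {
            out.add(file);
        }
    }

    public static void main(String[] args) {
        // Walk the directory given on the command line, or the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        List<File> files = new ArrayList<>();
        collectFiles(root, files);
        for (File f : files) {
            System.out.println(f.getPath());
        }
    }
}
```

In a two-pass design such as the one indexFile describes, a walk like this would presumably run once per pass, with the indexing flag forwarded to each file's handler.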
private void indexFile(java.io.File file, boolean indexing)
Adds a single file to the search engine. Each file is passed to this method twice. The first time, indexing is true, indicating that the file's information should be counted to provide the statistical information for the TFIDF algorithm. The second time, indexing is false, indicating that the term vector is actually added to the search engine.
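The two passes can be illustrated with a small standalone sketch. It does not use the iglu.ir classes: TwoPassTfIdf and process are hypothetical names, and the weighting tf * log(N / df) is one common TFIDF variant assumed here for illustration, not necessarily the one TFIDFVectorCreator applies.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Hypothetical stand-in for the two-pass scheme; this is not the iglu.ir API.
class TwoPassTfIdf {
    // term -> number of documents that contain it
    private final Map<String, Integer> docFreq = new HashMap<>();
    private int docCount = 0;

    // indexing == true:  only gather corpus statistics (pass 1).
    // indexing == false: build the weighted term vector (pass 2).
    Map<String, Double> process(String text, boolean indexing) {
        String[] tokens = text.toLowerCase().split("\\W+");
        if (indexing) {
            docCount++;
            // Count each distinct term once per document.
            for (String term : new HashSet<>(Arrays.asList(tokens))) {
                if (!term.isEmpty()) {
                    docFreq.merge(term, 1, Integer::sum);
                }
            }
            return Collections.emptyMap();
        }
        // Raw term frequencies within this document.
        Map<String, Integer> termFreq = new HashMap<>();
        for (String term : tokens) {
            if (!term.isEmpty()) {
                termFreq.merge(term, 1, Integer::sum);
            }
        }
        // Assumed weighting: tf * log(N / df).
        Map<String, Double> vector = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFreq.entrySet()) {
            int df = docFreq.getOrDefault(e.getKey(), 1);
            vector.put(e.getKey(), e.getValue() * Math.log((double) docCount / df));
        }
        return vector;
    }

    public static void main(String[] args) {
        String[] docs = {"the cat sat", "the dog ran"};
        TwoPassTfIdf tfidf = new TwoPassTfIdf();
        for (String d : docs) {
            tfidf.process(d, true);   // pass 1: statistics only
        }
        for (String d : docs) {
            System.out.println(tfidf.process(d, false)); // pass 2: weighted vectors
        }
    }
}
```

The first pass has to see every document before any vector is built, because the document frequency of a term is only known once the whole collection has been counted. In the sample data, "the" occurs in both documents, so its weight is log(2/2) = 0.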
private void printResults(TermVector query, ValueSortedMap results)
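As a rough illustration of printing matches in score order, the sketch below sorts a plain java.util.Map by value. ValueSortedMap itself is not documented on this page, so ResultPrinter, sortByScore, and the sample file names and scores are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of value-ordered results; not the iglu ValueSortedMap.
class ResultPrinter {

    // Returns the entries ordered by descending score; ties are broken by file name.
    static List<Map.Entry<String, Double>> sortByScore(Map<String, Double> results) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(results.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()));
        return entries;
    }

    public static void main(String[] args) {
        // Made-up file names and scores, for demonstration only.
        Map<String, Double> results = new HashMap<>();
        results.put("docs/a.txt", 0.12);
        results.put("docs/b.txt", 0.87);
        results.put("docs/c.txt", 0.45);
        for (Map.Entry<String, Double> e : sortByScore(results)) {
            System.out.printf("%.3f  %s%n", e.getValue(), e.getKey());
        }
    }
}
```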
public static void main(java.lang.String[] args)