java.lang.Object
  +--iglu.ir.StopList
Implementation of a simple stoplist.
Field Summary | |
private java.util.HashSet |
wordSet
|
Constructor Summary | |
StopList()
Constructs a stoplist from the file stoplist.txt
in the resources directory of IGLU. |
|
StopList(java.util.Collection c)
Constructs a new stoplist from a Collection of terms. |
|
StopList(java.io.File list)
Constructs a new stoplist from the contents of a file. |
|
StopList(java.io.InputStream is)
Constructs a new stoplist from the input stream, which is assumed to have one stopword on each line. |
|
StopList(java.io.Reader r)
Constructs a new stoplist from the Reader, which is assumed to provide one stopword on each line of input. |
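The one-stopword-per-line input format described for the InputStream and Reader constructors can be illustrated with a minimal sketch. The class `StopWordLoader` and its `load` method are hypothetical names, not part of IGLU; trimming whitespace and skipping blank lines are assumptions, since the documentation does not say how StopList treats them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not the IGLU implementation): reads one stopword per line. */
final class StopWordLoader {
    private StopWordLoader() {}

    /** Collects every non-blank line of the Reader into a set of stopwords. */
    static Set<String> load(Reader r) {
        Set<String> words = new HashSet<>();
        try (BufferedReader in = new BufferedReader(r)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Assumption: surrounding whitespace is not significant.
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            // Wrapped so that callers need not declare a checked exception.
            throw new UncheckedIOException(e);
        }
        return words;
    }
}
```

For example, `StopWordLoader.load(new java.io.StringReader("the\nof\nand\n"))` yields a set of three words. A HashSet is used here because the class documents its own field as a java.util.HashSet, which makes membership checks such as contains cheap.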
Method Summary | |
void |
applyFilter(Document d)
Filters all stopwords out of a document's indexable content. |
boolean |
contains(java.lang.String word)
Returns true if the word is in the stoplist. |
java.util.HashSet |
getList()
Returns the list of stopwords. |
private void |
initializeFromReader(java.io.Reader r)
Adds words to the stoplist from a Reader. |
static void |
main(java.lang.String[] args)
Tests to see if a word is in the stoplist. |
java.lang.String |
processText(java.lang.String string)
Returns the string with the stop words dropped out. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
private java.util.HashSet wordSet
Constructor Detail |
public StopList()
Constructs a stoplist from the file stoplist.txt in the resources directory of IGLU.
public StopList(java.io.InputStream is)
public StopList(java.io.Reader r)
public StopList(java.io.File list)
public StopList(java.util.Collection c)
Method Detail |
private void initializeFromReader(java.io.Reader r)
public boolean contains(java.lang.String word)
public java.lang.String processText(java.lang.String string)
Parameters:
string - a String containing words. This class assumes that the punctuation has already been dropped and that the words are separated by spaces.
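The behavior described for processText, dropping stop words from a space-separated string, might look roughly like the following. This is a sketch under the stated assumption that punctuation is already removed; `StopWordFilter` and `dropStopWords` are hypothetical names, and the real method may differ in details such as case handling or how it joins the remaining words.

```java
import java.util.Set;

/** Hypothetical sketch of stop-word removal; not the IGLU implementation. */
final class StopWordFilter {
    private StopWordFilter() {}

    /** Returns the text with every token found in stopWords removed. */
    static String dropStopWords(String text, Set<String> stopWords) {
        StringBuilder kept = new StringBuilder();
        // Tokens are assumed to be separated by whitespace only.
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty() || stopWords.contains(token)) {
                continue; // skip empty input and stopwords
            }
            if (kept.length() > 0) {
                kept.append(' ');
            }
            kept.append(token);
        }
        return kept.toString();
    }
}
```

With the stopwords "the" and "on", the input "the cat sat on the mat" comes back as "cat sat mat". Matching here is case-sensitive, which is an assumption of the sketch.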
public java.util.HashSet getList()
public void applyFilter(Document d)
Specified by:
applyFilter in interface DocumentFilter
Parameters:
d - a Document value
public static void main(java.lang.String[] args)
Tests to see if a word is in the stoplist read from stoplist.txt in the user's lib directory.