Parse


MetaMap Transfer
(MMTx)


Usage:   java programs.Parse [Options]

This parser breaks sentences into phrases. It is a minimal commitment barrier category parser.

The minimal commitment analysis assigns an underspecified syntactic analysis to lexically analyzed input. The current emphasis is on noun phrases; however, the entire input string is bracketed. The analysis can be thought of as the result of skimming the input to extract only noun phrases (NPs) and prepositional phrases (PrepPs).

The options are grouped by purpose in the sections that follow the prerequisites.


The prerequisites to run this program include:

CLASSPATH: The classpath needs to include the installation path to $MMTX/mmtx/classes, where $MMTX is the top-level directory that contains the MMTx software.

The classpath needs to include the path to the $MMTX/mmtx/config directory.

The classpath needs to include the path to MySQL's JDBC driver jar file (or the path to your own JDBC driver, if it is not MySQL's). For MySQL, this is the $MMTX/mmtx/mm.mysql.jdbc-1.2c directory.

(An example cshrc with the correct classpaths set is in the $MMTX/mmtx/config directory.)
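
As a rough illustration of the three classpath entries listed above, the following sketch joins them into a single classpath string. The class name `MmtxClasspath`, the `build` helper, and the fallback root `/opt/MMTx` are hypothetical and not part of MMTx; the JDBC entry is given as the directory named above, and if your driver ships as a jar file you would list that jar instead.

```java
import java.io.File;
import java.util.List;

public class MmtxClasspath {

    // Joins the three entries named in the prerequisites with the given separator.
    // mmtxRoot plays the role of $MMTX, the top-level MMTx directory.
    static String build(String mmtxRoot, String separator) {
        List<String> entries = List.of(
                mmtxRoot + "/mmtx/classes",             // compiled MMTx classes
                mmtxRoot + "/mmtx/config",              // configuration directory
                mmtxRoot + "/mmtx/mm.mysql.jdbc-1.2c"); // MySQL JDBC driver location
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // Assumption: the root comes from the first argument, then the MMTX
        // environment variable, then a made-up default used only for illustration.
        String env = System.getenv("MMTX");
        String root = args.length > 0 ? args[0] : (env != null ? env : "/opt/MMTx");
        System.out.println(build(root, File.pathSeparator));
    }
}
```

The printed value could then be assigned to CLASSPATH in your shell startup file, in the same way the example cshrc does.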

Input and Output File Options

The default behavior is to read input from standard input and write to standard output. This program is biased toward document-based processing: it must see the entire document before it processes any of it. Consequently, when input comes from standard input, processing does not begin when a line feed is seen. Processing begins only once the end-of-file marker has been sent: Control-D in the Bourne shell, or Control-Z followed by a carriage return in DOS.
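
The read-everything-first behavior described above can be pictured with a small sketch. This is not the MMTx implementation; the class name `ReadWholeDocument` and the `readAll` helper are hypothetical, and the sketch only shows why nothing happens until end of file arrives.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReadWholeDocument {

    // Reads until the reader reports end of input, then returns the whole text.
    // No per-line processing happens here, which mirrors document-based processing.
    static String readAll(Reader in) {
        StringBuilder document = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                document.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return document.toString();
    }

    public static void main(String[] args) {
        // Blocks until end of file is sent on standard input.
        String document = readAll(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        System.out.println("Read " + document.length() + " characters; processing would start here.");
    }
}
```

Run interactively, this sketch prints nothing after each line feed and reports only after the end-of-file marker is sent.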

Short Name   Long Name           Default Value   Purpose
__           --fileName=         stdIn           Name of file to process
__           --outputFileName=   stdOut          Name of the outputFile to write to
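
To show how the two file options combine with the usage line, here is a minimal sketch that assembles a command line for `programs.Parse`. The class `ParseCommand`, the `build` helper, and the file names are hypothetical; `-cp` is the standard Java launcher option, and the sketch assumes the value is written directly after the `=` of each long option, as the table suggests.

```java
import java.util.ArrayList;
import java.util.List;

public class ParseCommand {

    // Assembles the argument list: java -cp <classpath> programs.Parse <options>.
    static List<String> build(String classpath, String inputFile, String outputFile) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classpath);
        command.add("programs.Parse");
        command.add("--fileName=" + inputFile);        // read from a file instead of stdin
        command.add("--outputFileName=" + outputFile); // write to a file instead of stdout
        return command;
    }

    public static void main(String[] args) {
        // Hypothetical classpath and file names, used only for illustration.
        List<String> command = build("/opt/MMTx/mmtx/classes", "abstracts.txt", "phrases.out");
        System.out.println(String.join(" ", command));
    }
}
```

The resulting list could be handed to `java.lang.ProcessBuilder` to launch the parser from another Java program.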


Input Format Descriptions

The default behavior is to auto-detect whether the input is in MEDLINE citation format or free text. The following flags override this auto-detection.

Short Name   Long Name            Default Value   Purpose
__           --medlineCitations   false           The input is a collection of MEDLINE citations
__           --mrcon              false           The input is a collection of MRCON rows
__           --freeText           true            The input is free text
__           --fieldedText        false           The input file or stdin is fielded text
__           --textField=         2               For fielded text, which field contains the text
__           --fieldSeparator=    |               For fielded text, which character is the separator
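
The fielded-text options can be illustrated with a small sketch that pulls the text field out of one separated line. This is not MMTx code; the class `FieldedText` and the sample line are hypothetical, and the sketch assumes that fields are counted from 1, so the default `--textField=2` would mean the second field. The source does not state which numbering MMTx uses.

```java
import java.util.regex.Pattern;

public class FieldedText {

    // Returns the field at the given 1-based position (an assumption, see above).
    static String textField(String line, String separator, int textField) {
        // Pattern.quote is needed because "|" is a regex metacharacter;
        // the -1 limit keeps trailing empty fields.
        String[] fields = line.split(Pattern.quote(separator), -1);
        if (textField < 1 || textField > fields.length) {
            throw new IllegalArgumentException(
                    "Line has " + fields.length + " fields, asked for field " + textField);
        }
        return fields[textField - 1];
    }

    public static void main(String[] args) {
        // Made-up fielded line using the default separator and text field.
        String line = "12345|Heart attack in adults|2007";
        System.out.println(textField(line, "|", 2));
    }
}
```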


Output Display Options

Short Name   Long Name          Default Value   Purpose
-T           --tagger_output    false           Display tagger output
-p           --plain_syntax     true            Display the phrases
-x           --syntax           false           Display the MincoMan style output from the phrase extractor
__           --numberOfPhrases  false           Report the number of phrases the input has


Options to retrieve ever more levels of detail

Short Name   Long Name           Default Value   Purpose
__           --collections       false           Display Collection information
__           --documents         false           Display Documents
__           --sections          false           Display Sections
__           --sentences         false           Display Sentences
__           --phrases           false           Display Phrases
__           --nps               false           Display Noun Phrases
__           --lexicalElements   false           Display Lexical Elements
__           --lexicalEntries    false           Display Lexical Entries
__           --tokens            false           Display tokens
__           --pipedOutput       false           Display in a pipe delimited format
__           --details           false           Display the gory details


Processing Options

Options of the underlying data model

Short Name   Long Name             Default Value   Purpose
-t           --tag_text            false           Tag the text (NOTE: When used on the command line, turns tagging off)
__           --lexicalLookup=      2               Lookup Algorithm options 1-3
__           --ambiguousAcronyms   false           Disambiguate sentence boundaries using the acronyms and abbreviations file


Configuration Options

Short Name   Long Name                  Default Value                            Purpose
__           --configName=              mmtx.cfg                                 The name of the configuration file
-R           --MMTX_ROOT=               <installdir>/mmtx                        MMTX Root path
__           --MMTX_USERNAME=           mmtxUser                                 Database Account Name
__           --MMTX_HOSTNAME=           <localhost>                              Database Host Name
__           --ambiguousAcronymsFile=   data/lexicon/ambiguousAcronymsFile.txt   Location of the acronyms and abbreviations file needed in the tokenizer
__           --inflectionTable=         inflStatic2001Lexicon                    The Lexicon's inflection table used
__           --lexiconVersion=          Static2001Lexicon                        Lexicon Version
__           --nmm                      false                                    Flag that flips between MetaMap output and non-MetaMap output style. This flag is useful when combined with --pipedOutput and display flags such as --sentences, --phrases, --nps, --variants, and other levels of detail.


Tagger Specific Options

Short Name   Long Name              Default Value   Purpose
__           --useTagger            true            Use the tagger
__           --dontUseTagger        false           Don't use the tagger [Same as --tag_text]
-t           --tag_text             false           [Don't] tag the text
__           --taggerMachineName=   nls2            Tagger Server
__           --tagger=              XeroxParc       The name of the tagger that is hooked in
__           --taggerPortNumber=    1774            Tagger Server Port number


Last Modified: March 30, 2007 ii-public