Custom Data Sets (Data File Builder)
An MMTx data set contains preprocessed UMLS data that MMTx references as it maps text to concepts. The data set determines the domain of strings that MMTx will be able to map and the range of target concepts. The Relaxed, Moderate, and Strict models you can download from this site are examples of the standard data sets. If you want to use MMTx to map text to the concepts in the entire UMLS Metathesaurus, download one of the standard MMTx data sets before running the installation program.
If you have copyright issues or want to focus attention on selected Metathesaurus vocabularies, use MetamorphoSys to subset the Metathesaurus. In that case, or if you want to use MMTx to map text to another knowledge source, you will need to install the Data File Builder. The user's guide explains how to use the Data File Builder to create a custom data set.
Data File Builder User's Guide
(67 kb)
Custom Part of Speech Tagger
An essential part of MMTx processing is its parsing of the input text. One piece of that parsing is the assignment of part-of-speech tags to the tokens of a sentence. MMTx has an internal part-of-speech tagger but is designed to work with programs whose speciality is part-of-speech tagging. You can integrate such a program with MMTx by building a tagger client. The "Notes on Tagger Integration" document provides guidance on how to train your part-of-speech tagger program and implement a tagger client.
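The idea of a tagger client can be illustrated with a small conceptual sketch. It is written in Python for brevity and does not use any MMTx API: the class name TaggerClient, the function toy_external_tagger, the lexicon, and the tag labels are all hypothetical and exist only for this illustration. The actual client interface and tag set are described in the "Notes on Tagger Integration" document. The sketch shows only the general shape of the task: a thin wrapper hands the tokens of a sentence to an external tagging program and returns one tag per token.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for an external tagging program: a tiny lexicon lookup.
# The tag labels here are illustrative and are not the MMTx tag set.
TOY_LEXICON = {
    "the": "det",
    "heart": "noun",
    "attack": "noun",
    "was": "aux",
    "severe": "adj",
}


def toy_external_tagger(tokens: List[str]) -> List[str]:
    """Return one tag per token; unknown tokens default to 'noun'."""
    return [TOY_LEXICON.get(token.lower(), "noun") for token in tokens]


class TaggerClient:
    """Hypothetical wrapper that adapts an external tagger to a simple interface."""

    def __init__(self, backend: Callable[[List[str]], List[str]]) -> None:
        self._backend = backend

    def tag(self, sentence: str) -> List[Tuple[str, str]]:
        # Naive whitespace tokenization, assumed here only to keep the sketch short.
        tokens = sentence.split()
        tags = self._backend(tokens)
        # The backend must return exactly one tag per token.
        if len(tags) != len(tokens):
            raise ValueError("tagger returned a different number of tags than tokens")
        return list(zip(tokens, tags))


if __name__ == "__main__":
    client = TaggerClient(toy_external_tagger)
    for token, tag in client.tag("The heart attack was severe"):
        print(token, tag)
```

Keeping the external program behind a single callable means the tagger can be swapped without changing the code that consumes the tags, which is the general motivation for a client layer of this kind.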


