We provide the following resources for each of the baselines for research purposes. Please note that background information on some of these resources is available from our MBR Reference Material page.

Resource | Restrictions | Where to Find
DTD Files: We save a copy of the relevant DTD (Document Type Definition) files each year for working with the Baseline XML files.
No Restrictions | MBR_Files
Frequency Count Files: Basic frequency counts for the entire MEDLINE/PubMed Baseline, sorted both alphabetically and numerically, for the following MEDLINE fields. For every field except NM, we also provide sorted counts of occurrences as starred (Index Medicus) items.
     a. MH (MeSH Headings)
     b. SH (MeSH Subheadings)
     c. MH/SH combinations
     d. NM (Chemicals)
No Restrictions | MBR_Files
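The frequency counts described above could be approximated along the following lines. This is only a minimal sketch, not the procedure used to build the MBR files. It assumes each MH value arrives as a string such as "*Neoplasms/genetics", with "/" separating the heading from its subheadings and a leading "*" marking a starred item. The function names and the rule that a combination counts as starred when either part carries the asterisk are illustrative assumptions.

```python
from collections import Counter

KINDS = ("MH", "SH", "MH/SH")


def count_mesh_fields(mh_values):
    """Count MH, SH, and MH/SH occurrences, overall and as starred items.

    Assumption: "/" separates heading and subheadings, "*" marks a starred item.
    """
    totals = {kind: Counter() for kind in KINDS}
    starred = {kind: Counter() for kind in KINDS}

    def bump(kind, key, is_starred):
        totals[kind][key] += 1
        if is_starred:
            starred[kind][key] += 1

    for value in mh_values:
        heading, *subheadings = value.split("/")
        heading_star = heading.startswith("*")
        heading = heading.lstrip("*")
        bump("MH", heading, heading_star)
        for sub in subheadings:
            sub_star = sub.startswith("*")
            sub = sub.lstrip("*")
            bump("SH", sub, sub_star)
            # Assumption: a combination is starred if either part is starred.
            bump("MH/SH", f"{heading}/{sub}", heading_star or sub_star)
    return totals, starred


def alphabetical(counter):
    """Items sorted by name."""
    return sorted(counter.items())


def numerical(counter):
    """Items sorted by descending count, ties broken by name."""
    return sorted(counter.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample values for illustration only.
sample = ["*Neoplasms/genetics", "Neoplasms/*therapy", "Humans"]
totals, starred = count_mesh_fields(sample)
```

Calling alphabetical(totals["MH"]) or numerical(totals["MH"]) then yields the two sort orders mentioned in the description.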
Raw Data Files: Files containing raw data similar to that used to create our MBR Query Tool Database for this Baseline year. A README file describes the available files and their layouts.
No Restrictions | MBR_Files
Histogram/Summary Files: File showing the number of MH terms assigned to each of the MeSH Tree top-level and top-level + 1 categories during the latest year, to show how the assignment of terms varies from year to year.

File showing the number of MH terms assigned to each of the UMLS Semantic Type Groupings categories during the latest year, to show from a different perspective how the assignment of terms varies from year to year.
No Restrictions | MBR_Files
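A histogram of this kind might be tallied roughly as follows. The sketch assumes dotted MeSH tree numbers such as C04.588.180, and it reads "top-level" as the leading letter (C) and "top-level + 1" as the first segment (C04). That reading, the function name, and the heading-to-tree-number lookup table are assumptions made for illustration, and the sample data is invented.

```python
from collections import Counter


def category_histograms(assigned_headings, tree_numbers):
    """Tally assigned MH terms per tree category.

    assigned_headings: one entry per MH term assignment.
    tree_numbers: hypothetical lookup from heading to its tree numbers.
    """
    top_level = Counter()
    top_level_plus_one = Counter()
    for heading in assigned_headings:
        numbers = tree_numbers.get(heading, ())
        # A heading may appear in several branches; count each distinct
        # category once per assignment.
        for category in {number[0] for number in numbers}:
            top_level[category] += 1
        for category in {number.split(".")[0] for number in numbers}:
            top_level_plus_one[category] += 1
    return top_level, top_level_plus_one


# Invented headings and tree numbers, for illustration only.
tree_numbers = {
    "Heading X": ["C04.588.180", "C17.800.090"],
    "Heading Y": ["A01.236"],
}
assigned = ["Heading X", "Heading Y", "Heading X"]
top, top_plus_one = category_histograms(assigned, tree_numbers)
```

Running the same tally on each Baseline year and comparing the resulting counters is one way to see year-to-year variation.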
Related MeSH Files: NEW for 2017: The MeSH FTP download site (ftp://nlmpubs.nlm.nih.gov/online/mesh/) now includes separate directories for each release year of MeSH. In addition, MeSH created the folder "MESH_FILES" with the latest release files, which are updated every morning, Monday through Friday. The yearly release folders span from 2011 to the latest full release, which occurs in November of the preceding year (for example, 2016 MeSH was released in November 2015). A single directory is also included for earlier files from 1999-2010.
MeSH FTP Site MBR_Files
UMLS Semantic Groups File: We have saved a copy of the Semantic Groups file. The Semantic Groups are a coarse-grained set of semantic type groupings designed to reduce the complexity of the UMLS Metathesaurus. The 15 semantic groups provide a partition of the UMLS Metathesaurus for 99.5% of the concepts.
No Restrictions | MBR_Files
Unique Words from MEDLINE Baseline: We use a very simplified idea of a word: we throw away anything consisting entirely of numbers, throw away anything containing non-ASCII characters, and break at anything that is not alphanumeric. The "words" files contain single words and bigram words. The bigram words are built with a sliding window over the last "valid" word and the current word, so you get something like "last current", where we simply added a space. We also ignore a short list of 313 stop words, so they are not included in the various lists. Each of the "words" files also contains a frequency count for each item. Please note that we only look at the Title and Abstract fields to generate our list of words; we have ignored the MeSH Heading fields.
No Restrictions | MBR_Files
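The word and bigram rules described above can be sketched as follows. This is a rough approximation under stated assumptions, not the code used to generate the "words" files. The stop list shown is a tiny placeholder rather than the actual 313-word list, and lowercasing, resetting the bigram window at each field boundary, and the function names are all illustrative assumptions.

```python
import re
from collections import Counter

# Placeholder only: the real 313-word stop list is not reproduced here.
STOP_WORDS = frozenset({"the", "of", "and", "in", "a"})

# Runs of letters and digits; everything else acts as a break.
TOKEN = re.compile(r"[^\W_]+")


def is_valid(token):
    """Keep ASCII tokens that are not all digits and not stop words."""
    return token.isascii() and not token.isdigit() and token not in STOP_WORDS


def count_words(fields):
    """Count single words and bigrams over Title/Abstract strings."""
    singles, bigrams = Counter(), Counter()
    for text in fields:
        last_valid = None  # Assumption: the window restarts for each field.
        for token in TOKEN.findall(text.lower()):  # Assumption: lowercase.
            if not is_valid(token):
                continue  # Invalid tokens are skipped, not used in bigrams.
            singles[token] += 1
            if last_valid is not None:
                bigrams[f"{last_valid} {token}"] += 1
            last_valid = token
    return singles, bigrams


# Made-up title and abstract text for illustration.
singles, bigrams = count_words([
    "Effect of aspirin in 12 patients",
    "Café study of TP53 mutations",
])
```

Because the window pairs the last valid word with the current one, skipped items such as stop words or pure numbers do not break a bigram: "aspirin in 12 patients" contributes "aspirin patients".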

Last Modified: May 31, 2017   