Entrez
The Entrez (pronounced ɒnˈtreɪ[1]) Global Query Cross-Database Search System is a federated search engine, or web portal, that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website.[2] The NCBI is part of the National Library of Medicine (NLM), which is itself part of the National Institutes of Health (NIH), which in turn is part of the United States Department of Health and Human Services. The name "Entrez" (a greeting meaning "Come in" in French) was chosen to reflect the spirit of welcoming the public to search the content available from the NLM.
Entrez Global Query is an integrated search and retrieval system that provides access to all databases simultaneously with a single query string and user interface. Entrez can efficiently retrieve related sequences, structures, and references. The Entrez system can provide views of gene and protein sequences and chromosome maps. Some textbooks are also available online through the Entrez system.
Features
The Entrez front page provides, by default, access to the global query. All databases indexed by Entrez can be searched via a single query string, which supports Boolean operators and search term tags that limit parts of the search statement to particular fields. The search returns a unified results page that shows the number of hits in each database; each count also links to the search results for that database.
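As an illustration of how Boolean operators and field tags combine in one query string, the following is a minimal Python sketch. The helper names `tag` and `combine` are hypothetical and not part of any NCBI library, and the `[Title]` tag and the search terms are assumed examples rather than anything taken from this article.

```python
def tag(term, field=None):
    """Quote a multi-word term and append an optional [Field] tag."""
    text = f'"{term}"' if " " in term else term
    return f"{text}[{field}]" if field else text


def combine(operator, *clauses):
    """Join clauses with a Boolean operator and wrap them in parentheses."""
    operator = operator.upper()
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(clauses) + ")"


# Hypothetical search: restrict one term to the title field and
# accept either of two phrases anywhere in the record.
query = combine(
    "AND",
    tag("BRCA1", "Title"),
    combine("OR", tag("breast cancer"), tag("ovarian cancer")),
)
print(query)
# (BRCA1[Title] AND ("breast cancer" OR "ovarian cancer"))
```

The parentheses make the grouping explicit, so the result does not depend on how the search engine orders operators of equal precedence.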
Entrez also provides a similar interface for searching each particular database and for refining search results. The Limits feature allows the user to narrow a search using a web forms interface. The History feature gives a numbered list of recently performed queries. Results of previous queries can be referred to by number and combined via Boolean operators. Search results can be saved temporarily in a Clipboard. Users with a MyNCBI account can save queries indefinitely and can also choose to receive e-mail updates with new search results for saved queries of most databases. Entrez is widely used in the field of biotechnology as a reference tool by students and professionals alike.
Databases
Entrez searches the following databases:
- PubMed: biomedical literature citations and abstracts, including Medline (articles from mainly medical journals). Links to PubMed Central and other full-text resources are provided for articles from the 1990s.
- PubMed Central: free, full-text journal articles
- Site Search: NCBI web and FTP sites
- Books: online books
- Online Mendelian Inheritance in Man (OMIM)
- Nucleotide: sequence database (GenBank)
- Protein: sequence database (GenPept)
- Genome: whole genome sequences and mapping
- Structure: three-dimensional macromolecular structures
- Taxonomy: organisms in GenBank Taxonomy
- dbSNP: single nucleotide polymorphisms
- Gene:[3] gene-centered information
- HomoloGene: eukaryotic homology groups
- PubChem Compound: unique small molecule chemical structures
- PubChem Substance: deposited chemical substance records
- Genome Project: genome project information
- UniGene: gene-oriented clusters of transcript sequences
- CDD: conserved protein domain database
- PopSet: population study data sets (epidemiology)
- GEO Profiles: expression and molecular abundance profiles
- GEO DataSets: experimental sets of GEO data
- Sequence Read Archive: high-throughput sequencing data
- Cancer Chromosomes: cytogenetic databases
- PubChem BioAssay: bioactivity screens of chemical substances
- Probe: sequence-specific reagents
- NLM Catalog: NLM bibliographic data for over 1.2 million journals, books, audiovisuals, computer software, electronic resources, and other materials resident in LocatorPlus (updated every weekday).
Access
In addition to using the search engine forms to query the data in Entrez, NCBI provides the Entrez Programming Utilities[4] (eUtils) for more direct access to query results. The eUtils are accessed by sending specially formed URLs to the NCBI server and parsing the XML response. An eUtils SOAP interface also existed, but it was terminated in July 2015.[5]
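The request-and-parse pattern described above can be sketched in Python as follows. This is a minimal illustration, not an official client: the base URL, the `esearch.fcgi` endpoint, and the `db`, `term`, and `retmax` parameters are assumptions drawn from NCBI's public E-utilities documentation rather than from this article, and `SAMPLE_RESPONSE` is a made-up, trimmed response with placeholder IDs so that the example runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed base URL of the E-utilities service (not stated in this article).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build a search URL for one database and one query string."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_esearch(xml_text):
    """Extract the total hit count and the returned record IDs."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count", default="0"))
    ids = [element.text for element in root.findall("./IdList/Id")]
    return count, ids


# Illustrative response only; the IDs are placeholders, not real records.
SAMPLE_RESPONSE = """<eSearchResult>
  <Count>2</Count>
  <IdList><Id>111</Id><Id>222</Id></IdList>
</eSearchResult>"""

url = esearch_url("pubmed", "BRCA1[Title]", retmax=5)
count, ids = parse_esearch(SAMPLE_RESPONSE)
print(url)
print(count, ids)
```

To query the live service, the body returned by `urllib.request.urlopen(url).read()` could be passed to `parse_esearch` in place of the sample string. NCBI publishes usage guidelines for automated access, which should be checked before sending requests in bulk.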
History
In 1991, Entrez was introduced in CD form. In 1993, a client-server version of the software provided connectivity with the internet. In 1994, NCBI established a website, and Entrez was part of this initial release. In 2001, the Entrez Bookshelf was released, and in 2003 the Entrez Gene database was developed.[6]
References
- "Definition of 'entrez'". Collins Dictionary [Internet].
- NCBI Resource Coordinators (2012). "Database resources of the National Center for Biotechnology Information". Nucleic Acids Research. 41 (Database issue): D8–D20. doi:10.1093/nar/gks1189. PMC 3531099. PMID 23193264.
- "Home - Gene - NCBI".
- Entrez Utilities. National Center for Biotechnology Information (US). 2010.
- The E-utility Web Service (SOAP). National Center for Biotechnology Information (US). 23 January 2015.
- Smith, Kent. "A Brief History of NCBI's Formation and Growth". The NCBI Handbook [Internet]. 2nd edition. Retrieved 3 May 2014.