Semantic Scholar
Semantic Scholar is an artificial intelligence–powered research tool for scientific literature developed at the Allen Institute for AI and publicly released in November 2015.[1] It uses advances in natural language processing to provide summaries for scholarly papers.[2] The Semantic Scholar team actively researches the use of artificial intelligence in natural language processing, machine learning, human-computer interaction, and information retrieval.[3]
Type of site | Search engine |
---|---|
Created by | Allen Institute for Artificial Intelligence |
URL | semanticscholar |
Launched | November 2015 |
Semantic Scholar began as a database covering computer science, geoscience, and neuroscience.[4] In 2017, the system began including biomedical literature in its corpus.[4] As of September 2022, it includes over 200 million publications from all fields of science.[5]
Technology
Semantic Scholar provides one-sentence summaries of scientific papers. One of its aims is to address the challenge of reading numerous titles and lengthy abstracts on mobile devices.[6] It also seeks to ensure that the three million scientific papers published yearly reach readers, since it is estimated that only half of this literature is ever read.[7]
Artificial intelligence is used to capture the essence of a paper, generating the summary through an "abstractive" technique.[2] The project uses a combination of machine learning, natural language processing, and machine vision to add a layer of semantic analysis to the traditional methods of citation analysis, and to extract relevant figures, tables, entities, and venues from papers.[8][9]
In contrast with Google Scholar and PubMed, Semantic Scholar is designed to highlight the most important and influential elements of a paper.[10] The AI technology is designed to identify hidden connections and links between research topics.[11] Like the previously cited search engines, Semantic Scholar also exploits graph structures, which include the Microsoft Academic Knowledge Graph, Springer Nature's SciGraph, and the Semantic Scholar Corpus.[12]
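The role of graph structure in citation analysis can be illustrated with a toy example. The sketch below is not Semantic Scholar's method; it only shows the traditional baseline (counting incoming citations) on a small, entirely hypothetical citation graph. The paper IDs and function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical citation graph: each paper maps to the papers it cites.
citations = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["C", "B"],
}

def citation_counts(graph):
    """Count incoming citations (in-degree) for every paper in the graph."""
    counts = Counter({paper: 0 for paper in graph})
    for cited_list in graph.values():
        for cited in set(cited_list):  # ignore duplicate references within one paper
            counts[cited] += 1
    return counts

def most_cited(graph, n=1):
    """Return the n most-cited papers as (paper, count) pairs."""
    counts = citation_counts(graph)
    # Sort by count (descending), then by ID so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

A raw count like this treats every citation equally; the semantic layer described above is meant to go beyond that by analysing the content and context of the citing papers.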
Each paper hosted by Semantic Scholar is assigned a unique identifier called the Semantic Scholar Corpus ID (abbreviated S2CID). For example, the Fricke (2018) article listed in the references carries S2CID 45802944.
Semantic Scholar is free to use and, unlike similar search engines (e.g., Google Scholar), does not search for material that is behind a paywall.[4][13]
One study systematically evaluated the search abilities of Semantic Scholar and found the search engine to be 98.88% accurate at uncovering the data sought.[13] The same study examined other Semantic Scholar functions, including tools to survey metadata as well as several citation tools.[13]
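As a rough illustration of how an S2CID might be handled programmatically, the sketch below extracts the identifier from a citation string and builds a lookup URL without making any network request. The base path and the `CorpusId:` prefix reflect the public Semantic Scholar Academic Graph API as generally documented, but they are assumptions here and should be checked against the current API documentation; the helper names are hypothetical.

```python
import re
from typing import Optional

# Assumed base path of the Semantic Scholar Academic Graph API; verify before use.
API_BASE = "https://api.semanticscholar.org/graph/v1/paper/"

def extract_s2cid(citation: str) -> Optional[int]:
    """Return the numeric S2CID found in a citation string, or None if absent."""
    match = re.search(r"\bS2CID\s+(\d+)", citation)
    return int(match.group(1)) if match else None

def s2cid_lookup_url(s2cid: int) -> str:
    """Build a paper-lookup URL for a given S2CID (assumed URL scheme)."""
    if s2cid <= 0:
        raise ValueError("S2CID must be a positive integer")
    return f"{API_BASE}CorpusId:{s2cid}"

# Citation text taken from this article's reference list.
citation = "doi:10.5195/jmla.2018.280. ISSN 1558-9439. S2CID 45802944."
s2cid = extract_s2cid(citation)
if s2cid is not None:
    print(s2cid_lookup_url(s2cid))
```

Keeping URL construction separate from any HTTP call makes the helper easy to test offline and leaves the choice of HTTP client to the caller.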
Number of users and publications
As of January 2018, following a 2017 project that added biomedical papers and topic summaries, the Semantic Scholar corpus included more than 40 million papers from computer science and biomedicine.[14] In March 2018, Doug Raymond, who developed machine learning initiatives for the Amazon Alexa platform, was hired to lead the Semantic Scholar project.[15] As of August 2019, the number of included paper metadata records (not the actual PDFs) had grown to more than 173 million[16] after the addition of the Microsoft Academic Graph records.[17] In 2020, a partnership between Semantic Scholar and the University of Chicago Press Journals made all articles published under the University of Chicago Press available in the Semantic Scholar corpus.[18] At the end of 2020, Semantic Scholar had indexed 190 million papers.[19]
In 2020, Semantic Scholar reached seven million users a month.[6]
See also
- Citation analysis – Examination of the frequency, patterns, and graphs of citations in documents
- Citation index – Index of citations between publications
- Knowledge extraction – Creation of knowledge from structured and unstructured sources
- List of academic databases and search engines
- Scientometrics – Study of measuring and analysing science, technology and innovation
References
- Eunjung Cha, Ariana (3 November 2015). "Paul Allen's AI research group unveils program that aims to shake up how we search scientific knowledge. Give it a try". The Washington Post. Archived from the original on 6 November 2019. Retrieved November 3, 2015.
- Hao, Karen (November 18, 2020). "An AI helps you summarize the latest in AI". MIT Technology Review. Retrieved 2021-02-16.
- "Semantic Scholar Research". research.semanticscholar.org. Retrieved 2021-11-22.
- Fricke, Suzanne (2018-01-12). "Semantic Scholar". Journal of the Medical Library Association. 106 (1): 145–147. doi:10.5195/jmla.2018.280. ISSN 1558-9439. S2CID 45802944.
- Matthews, David (1 September 2021). "Drowning in the literature? These smart software tools can help". Nature. Retrieved 5 September 2022.
...the publicly available corpus compiled by Semantic Scholar — a tool set up in 2015 by the Allen Institute for Artificial Intelligence in Seattle, Washington — amounting to around 200 million articles, including preprints.
- Grad, Peter (November 24, 2020). "AI tool summarizes lengthy papers in a sentence". Tech Xplore. Retrieved 2021-02-16.
- "Allen Institute's Semantic Scholar now searches across 175 million academic papers". VentureBeat. 2019-10-23. Retrieved 2021-02-16.
- Bohannon, John (11 November 2016). "A computer program just ranked the most influential brain scientists of the modern era". Science. doi:10.1126/science.aal0371. Archived from the original on 29 April 2020. Retrieved 12 November 2016.
- Christopher Clark; Santosh Divvala (2016). PDFFigures 2.0: Mining figures from research papers. Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries. ISBN 978-1-4503-4229-2. Wikidata Q108172042.
- "Semantic Scholar". International Journal of Language and Literary Studies. Retrieved 2021-11-09.
- Baykoucheva, Svetla (2021). Driving Science Information Discovery in the Digital Age. Chandos Publishing. p. 91. ISBN 978-0-12-823724-3.
- Jose, Joemon M.; Yilmaz, Emine; Magalhães, João; Castells, Pablo; Ferro, Nicola; Silva, Mário J.; Martins, Flávio (2020). Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, Proceedings, Part I. Cham, Switzerland: Springer Nature. p. 254. ISBN 978-3-030-45438-8.
- Hannousse, Abdelhakim (2021). "Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role". IET Software. 15 (1): 126–146. doi:10.1049/sfw2.12011. ISSN 1751-8814. S2CID 234053002.
- "AI2 scales up Semantic Scholar search engine to encompass biomedical research". GeekWire. 2017-10-17. Archived from the original on 2018-01-19. Retrieved 2018-01-18.
- "Tech Moves: Allen Institute Hires Amazon Alexa Machine Learning Leader; Microsoft Chairman Takes on New Investor Role; and More". GeekWire. 2018-05-02. Archived from the original on 2018-05-10. Retrieved 2018-05-09.
- "Semantic Scholar". Semantic Scholar. Archived from the original on 11 August 2019. Retrieved 11 August 2019.
- "AI2 joins forces with Microsoft Research to upgrade search tools for scientific studies". GeekWire. 2018-12-05. Archived from the original on 2019-08-25. Retrieved 2019-08-25.
- "The University of Chicago Press joins more than 500 publishers working with Semantic Scholar to improve search and discoverability". RCNi Company Limited. Retrieved 2021-11-22.
- Dunn, Adriana (December 14, 2020). "Semantic Scholar Adds 25 Million Scientific Papers in 2020 Through New Publisher Partnerships" (PDF). Semantic Scholar. Retrieved November 22, 2021.