Introduction

Welcome to the Scholarcy APIs. We have three core API services:

  1. Metadata extraction API (https://api.scholarcy.com): This is a developer API that comprises a number of endpoints for extracting machine-readable knowledge as JSON data from documents in many formats. The service is optimised for research papers and articles, but should provide useful results for any document in any format.
  2. Synopsis API (https://summarizer.scholarcy.com/): This includes a web front end for testing and a developer endpoint at /summarize.
  3. References API (https://ref.scholarcy.com/): This is a free, public developer API that comprises endpoints for extracting references as JSON, XML, BibTeX, RIS, and CSV from PDF, Word and plain text documents. This service can be deployed for you on more powerful servers for use at scale.

We provide examples in Shell, Ruby, and Python.

Authentication

Authentication headers must be sent with every request:


# 1. Metadata API:

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/posters/generate'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb')
          })
response = request.execute
puts(response.body)

# 2. Synopsis API:

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://summarizer.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/summarize'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb')
          })
response = request.execute
puts(response.body)

# 3. References API:

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/download'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb')
          })
response = request.execute
puts(response.body)

# 1. Metadata API:

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/posters/generate'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          timeout=timeout)
    print(r.json())

# 2. Synopsis API:

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://summarizer.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/summarize'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          timeout=timeout)
    print(r.json())

# 3. References API:

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/download'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          timeout=timeout)
    print(r.json())

# 1. Metadata API:

# With shell, you can just pass the correct header with each request
curl "https://api.scholarcy.com/api/posters/generate" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf"

  curl "https://api.scholarcy.com/api/posters/generate" \
    -H "Authorization: Bearer abcdefg" \
    -d "url=https://www.nature.com/articles/s41746-019-0180-3"

# 2. Synopsis API:

# With shell, you can just pass the correct header with each request
curl "https://summarizer.scholarcy.com/summarize" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf"

  curl "https://summarizer.scholarcy.com/summarize" \
    -H "Authorization: Bearer abcdefg" \
    -d "url=https://www.nature.com/articles/s41746-019-0180-3"

# 3. References API:

# With shell, you can just pass the correct header with each request
curl "https://ref.scholarcy.com/api/references/download" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf"

  curl "https://ref.scholarcy.com/api/references/download" \
    -H "Authorization: Bearer abcdefg" \
    -d "url=https://www.nature.com/articles/s41746-019-0180-3.pdf"

Make sure to replace abcdefg with your API key.

Our APIs are currently open: without authentication, you can process a limited number of documents per day, each below 7MB in size.

For unauthenticated use, please omit the Authorization header, or pass an empty string for the Bearer token.
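As a minimal sketch of the two modes described above, the helper below builds the headers for either authenticated or unauthenticated use and checks a local file against the 7MB limit before upload. The function names are hypothetical, and reading "7MB" as 7 * 1024 * 1024 bytes is an assumption, not a documented value.

```python
import os

# Assumption: "7MB" is interpreted as 7 MiB; the exact server-side limit may differ.
MAX_UNAUTHENTICATED_BYTES = 7 * 1024 * 1024


def build_headers(api_key=None):
    """Return request headers; an empty dict means unauthenticated use."""
    if not api_key:
        # Omit the Authorization header entirely for unauthenticated requests.
        return {}
    return {'Authorization': 'Bearer ' + api_key}


def within_unauthenticated_limit(file_path, limit=MAX_UNAUTHENTICATED_BYTES):
    """Return True if the local file is below the size limit for open use."""
    return os.path.getsize(file_path) < limit


print(build_headers())           # unauthenticated
print(build_headers('abcdefg'))  # authenticated
```

You can pass the result of build_headers straight to the headers argument of requests.post in the examples above.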

For full, authenticated use of each API, you may contact us for an API key for either or both API services; please see our Pricing page.

Each API service (metadata and synopsis) requires the purchase of a separate key.

Soon you will be able to self-register a Scholarcy API key at our developer portal.

The Scholarcy APIs expect the API key to be included in all API requests to the server in a header that looks like the following:

Authorization: Bearer abcdefg

Generate a Poster

The API endpoints at https://api.scholarcy.com/api/posters/generate will extract the information needed to populate your own poster-creation services, and will also generate a basic PowerPoint template for you to use as a starting point for editing.

POST a local file to generate a poster

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/posters/generate'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb'),
            :type => 'headline',
            :start_page => 24,
            :end_page => 37
          })
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/posters/generate'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

params = {'type': 'headline', 'start_page': 24, 'end_page': 37}
with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          data=params,
          timeout=timeout)
    print(r.json())

curl "https://api.scholarcy.com/api/posters/generate" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf" \
  -F "type=headline" \
  -F "start_page=24" \
  -F "end_page=37"

The above command returns JSON structured like this:

{
  "filename": "filename.pdf",
  "content_type": "application/pdf",
  "file_size": 123456,
  "metadata": {
    "title": "Article title",
    "author": "Smith, J.",
    "pages": "15",
    "date": 2021,
    "affiliations": [
      "Department of Silly Walks, University of Life, London, UK"
    ],
    "identifiers": {
      "arxiv": null,
      "doi": "10.1010/101010.10.10.1101010",
      "isbn": null,
      "doc_id": null
    },
    "abstract": "This is a very exciting paper. Please read it.",
    "references": [
      "1. Smith, J. (2001) A study on self-citation. J. Chem. Biol., 123, 456-789.",
      "2. Jones, R. (2015) He didn't write this one. Science, 101, 101010."
    ],
    "emails": [
      "smith.j@uni.ac.uk"
    ],
    "figure_captions": [
      {
        "id": "1",
        "caption": "Figure 1 caption"
      },
      {
        "id": "2",
        "caption": "Figure 2 caption"
      }
    ],
    "figure_urls": [
      "https://api.scholarcy.com/images/file.pdf_agtuhsnt_images_1x1uj_5t/img-000.png",
      "https://api.scholarcy.com/images/file.pdf_agtuhsnt_images_1x1uj_5t/img-002.png"
    ],
    "poster_url": "https://api.scholarcy.com/posters/file.pdf_agtuhsnt.pptx",
    "keywords": [
      "atomic force microscopy",
      "dna nanostructure",
      "drug release",
      "single-stranded DNA",
      "double-stranded DNA",
      "dox molecule"
    ],
    "abbreviations": {
      "DONs": "DNA origami nanostructures",
      "ROS": "reactive oxygen species",
      "DNase I": "deoxyribonuclease I",
      "PEG": "polyethylene glycol"
    },
    "headline": "We prove some important facts in this paper",
    "highlights": [
      "Facts are very important.",
      "The force is strong in this one.",
      "We ran some tests and this is what we found"
    ],
    "summary": {
      "Introduction": [
        "Introduction paragraph 1",
        "Introduction paragraph 2"
      ],
      "Methods": [
        "We mixed some chemicals.",
        "We heated them up.",
        "We distilled the mixture."
      ],
      "Results": [
        "There was a big explosion",
        "But the crystals were pure",
        "We identified a new compound"
      ],
      "Conclusion": [
        "We proved some important things and we summarise them here.",
        "Further work is necessary"
      ]
    }
  }
}
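To illustrate how the response above might be consumed, here is a short sketch that pulls the generated PowerPoint link and the figure image links out of the parsed JSON. It assumes the response has the shape shown in the sample (with poster_url and figure_urls nested under metadata); the helper names are hypothetical.

```python
def poster_download_url(response_json):
    """Return the URL of the generated poster, or None if it is absent."""
    return response_json.get('metadata', {}).get('poster_url')


def figure_image_urls(response_json):
    """Return the list of extracted figure image URLs (possibly empty)."""
    return response_json.get('metadata', {}).get('figure_urls', [])


# A trimmed copy of the sample response shown above.
sample_response = {
    'filename': 'filename.pdf',
    'metadata': {
        'title': 'Article title',
        'figure_urls': [
            'https://api.scholarcy.com/images/file.pdf_agtuhsnt_images_1x1uj_5t/img-000.png',
            'https://api.scholarcy.com/images/file.pdf_agtuhsnt_images_1x1uj_5t/img-002.png',
        ],
        'poster_url': 'https://api.scholarcy.com/posters/file.pdf_agtuhsnt.pptx',
    },
}

print(poster_download_url(sample_response))
print(len(figure_image_urls(sample_response)))
```

In practice you would pass r.json() from the request examples instead of sample_response.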

This endpoint generates a poster from a local file. File formats supported are:

HTTP Request

POST https://api.scholarcy.com/api/posters/generate

Query Parameters

Parameter | Default | Description
file | null | A file object.
url | null | URL of a public, open-access document.
type | full | The type of poster to generate. full creates a large, landscape poster with blocks for each section. headline creates a portrait poster containing the main takeaway finding and a single image.
start_page | 1 | Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
end_page | null | Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.

GET a poster from a URL

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/posters/generate'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}

request = RestClient::Request.new(
          :method => :get,
          :url => POST_ENDPOINT,
          # RestClient sends GET query parameters through the :params key of the headers hash
          :headers => headers.merge(:params => {
            :url => 'https://www.nature.com/articles/s41746-019-0180-3',
            :type => 'full',
            :start_page => 1
          }))
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/posters/generate'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

params = {
    'url': 'https://www.nature.com/articles/s41746-019-0180-3',
    'type': 'full',
    'start_page': 1
}
# Send the parameters in the query string of a GET request
r = requests.get(POST_ENDPOINT,
      headers=headers,
      params=params,
      timeout=timeout)
print(r.json())

# -G sends the -d values as query parameters in a GET request
curl -G "https://api.scholarcy.com/api/posters/generate" \
  -H "Authorization: Bearer abcdefg" \
  -d "url=https://www.nature.com/articles/s41746-019-0180-3" \
  -d "type=full" \
  -d "start_page=1"

The above command returns JSON structured as for the POST endpoint:

{
  "filename": "filename.pdf",
  "content_type": "application/pdf",
  "file_size": 123456,
  "metadata": {
    "title": "Article title",
    "author": "Smith, J.",
    "pages": "15",
    "date": 2021,
    "affiliations": [
    ],
    "identifiers": {
      "arxiv": null,
      "doi": "10.1010/101010.10.10.1101010",
      "isbn": null,
      "doc_id": null
    },
    "abstract": "This is a very exciting paper. Please read it.",
    "references": [

    ],
    "emails": [
      "smith.j@uni.ac.uk"
    ],
    "figure_captions": [
      {
        "id": "1",
        "caption": "Figure 1 caption"
      },
      {
        "id": "2",
        "caption": "Figure 2 caption"
      }
    ],
    "figure_urls": [

    ],
    "poster_url": "https://api.scholarcy.com/posters/file.pdf_agtuhsnt.pptx",
    "keywords": [
    ],
    "abbreviations": {
    },
    "headline": "We prove some important facts in this paper",
    "highlights": [
    ],
    "summary": {
      "Introduction": [

      ],
      "Methods": [

      ],
      "Results": [

      ],
      "Conclusion": [

      ]
    }
  }
}

This endpoint generates a poster from a remote URL. The remote URL can resolve to a document in any of the formats listed for the POST endpoint.

HTTP Request

GET https://api.scholarcy.com/api/posters/generate

Query Parameters

Parameter | Default | Description
url | null | URL of a public, open-access document.
type | full | The type of poster to generate. full creates a large, landscape poster with blocks for each section. headline creates a portrait poster containing the main takeaway finding and a single image.
start_page | 1 | Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
end_page | null | Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.

Extract Highlights

The API endpoints at https://api.scholarcy.com/api/highlights/extract will pull out the key findings/highlights of an article and also provide a longer, extractive summary.

POST a local file to extract highlights

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/highlights/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb'),
            :start_page => 24,
            :end_page => 37
          })
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/highlights/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

params = {'wiki_links': True, 'reference_links': True}
with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          data=params,
          timeout=timeout)
    print(r.json())

curl "https://api.scholarcy.com/api/highlights/extract" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf" \
  -F "start_page=24" \
  -F "end_page=37"

The above command returns JSON structured like this:

{
  "filename": "filename.pdf",
  "content_type": "application/pdf",
  "file_size": 123456,
  "metadata": {
    "title": "Article title",
    "author": "Smith, J.",
    "pages": "15",
    "date": 2021,
    "affiliations": [
      "Department of Silly Walks, University of Life, London, UK"
    ],
    "identifiers": {
      "arxiv": null,
      "doi": "10.1010/101010.10.10.1101010",
      "isbn": null,
      "doc_id": null
    },
    "abstract": "This is a very exciting paper. Please read it.",
    "funding": [
      {
        "award-group": [
          {
            "funding-source": "FEDER/COMPETE",
            "award-id": [
              "AAA/BBB/04007/2019"
            ]
          }
        ],
        "funding-statement": "We acknowledge financial support from Fundação para a Ciência e a Tecnologia and FEDER/COMPETE (grant AAA/BBB/04007/2019)"
      }
    ]
  },
  "keywords": [
    "atomic force microscopy",
    "dna nanostructure",
    "drug release",
    "single-stranded DNA",
    "double-stranded DNA",
    "dox molecule"
  ],
  "keyword_relevance": {
    "atomic force microscopy": 0.345678,
    "dna nanostructure": 0.23456,
    "drug release": 0.12345,
    "single-stranded DNA": 0.034567,
    "double-stranded DNA": 0.02345,
    "dox molecule": 0.01234
  },
  "abbreviations": {
    "DONs": "DNA origami nanostructures",
    "ROS": "reactive oxygen species",
    "DNase I": "deoxyribonuclease I",
    "PEG": "polyethylene glycol"
  },
  "headline": "We prove some important facts in this paper",
  "highlights": [
    "Facts are very important.",
    "The force is strong in this one.",
    "We ran some tests and this is what we found"
  ],
  "findings": [
    "A statistically significant difference was noted between the four groups on the combined dependent variables",
    "We also noted significant differences when we performed a one-way between-groups analysis of variance on each of the 14 items (P < 0.001)"
  ],
  "summary": [],
  "structured_summary": {
    "Introduction": [
      "Introduction paragraph 1",
      "Introduction paragraph 2"
    ],
    "Methods": [
      "We mixed some chemicals.",
      "We heated them up.",
      "We distilled the mixture."
    ],
    "Results": [
      "There was a big explosion",
      "But the crystals were pure",
      "We identified a new compound"
    ],
    "Conclusion": [
      "We proved some important things and we summarise them here.",
      "Further work is necessary"
    ]
  }
}
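The keyword_relevance object in the response above maps each extracted key term to a score. As a small sketch, assuming higher scores mean more relevant terms (as the sample values suggest), you could rank the terms like this. The helper name is hypothetical.

```python
def rank_keywords(response_json, top_n=3):
    """Return the top_n key terms, ordered from highest to lowest relevance."""
    relevance = response_json.get('keyword_relevance', {})
    ranked = sorted(relevance.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _score in ranked[:top_n]]


# The keyword_relevance values from the sample response shown above.
sample_response = {
    'keyword_relevance': {
        'atomic force microscopy': 0.345678,
        'dna nanostructure': 0.23456,
        'drug release': 0.12345,
        'single-stranded DNA': 0.034567,
        'double-stranded DNA': 0.02345,
        'dox molecule': 0.01234,
    }
}

print(rank_keywords(sample_response, top_n=2))
```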

This endpoint extracts highlights from a local file. File formats supported are:

HTTP Request

POST https://api.scholarcy.com/api/highlights/extract

Query Parameters

Parameter | Default | Description
file | null | A file object.
url | null | URL of a public, open-access document.
text | null | Plain text content to be processed.
start_page | 1 | Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
end_page | null | Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
external_metadata | false | If true, fetch article metadata from the relevant remote repository (e.g. CrossRef).
wiki_links | false | If true, map extracted key terms to their Wikipedia pages.
reference_links | false | If true, parse and link each reference to its full text location.
replace_pronouns | false | If true, replace first-person pronouns with third-person mentions (the author(s), they).
key_points | 5 | The number of key points/key takeaway items to extract.
sampling | representative | For large documents, when extracting key terms, use either a representative sample of the full content, or the full-text content.
extract_snippets | true | If true, sample snippets from each section; otherwise, sample the full text.
add_background_info | false | If true, generate an introductory sentence. Useful for generating an abstract from an article.
add_concluding_info | false | If true, generate a concluding sentence. Useful for generating an abstract from an article.
structured_summary | false | If true, summarise each of the main sections separately, and then provide a summary structured according to those sections.
summary_engine | v1 | v1: best for articles. v2: best for book chapters.
highlights_algorithm | weighted | weighted: attend more closely to the results and conclusion. unweighted: attend to all content equally.
headline_from | highlights | highlights: use the highest scoring highlight as the headline. summary: use the first summary sentence as the headline. conclusions: use the first conclusion statement as the headline. claims: use the main claim as the headline.

GET highlights from a URL

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/highlights/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}

request = RestClient::Request.new(
          :method => :get,
          :url => POST_ENDPOINT,
          # RestClient sends GET query parameters through the :params key of the headers hash
          :headers => headers.merge(:params => {
            :url => 'https://www.nature.com/articles/s41746-019-0180-3',
            :start_page => 1
          }))
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/highlights/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

payload = {
    'url': 'https://www.nature.com/articles/s41746-019-0180-3',
    'start_page': 1
}
r = requests.get(POST_ENDPOINT,
      headers=headers,
      params=payload,
      timeout=timeout)
print(r.json())

# -G sends the -d values as query parameters in a GET request
curl -G "https://api.scholarcy.com/api/highlights/extract" \
  -H "Authorization: Bearer abcdefg" \
  -d "url=https://www.nature.com/articles/s41746-019-0180-3" \
  -d "start_page=1"

The above command returns JSON structured as for the POST endpoint:

{
  "filename": "filename.pdf",
  "content_type": "application/pdf",
  "file_size": 123456,
  "metadata": {
    "title": "Article title",
    "author": "Smith, J.",
    "pages": "15",
    "date": 2021,
    "affiliations": [
      "Department of Silly Walks, University of Life, London, UK"
    ],
    "identifiers": {
      "arxiv": null,
      "doi": "10.1010/101010.10.10.1101010",
      "isbn": null,
      "doc_id": null
    },
    "abstract": "This is a very exciting paper. Please read it.",
    "funding": [
      {
        "award-group": [
          {
            "funding-source": "FEDER/COMPETE",
            "award-id": [
              "AAA/BBB/04007/2019"
            ]
          }
        ],
        "funding-statement": "..."
      }
    ]
  },
  "keywords": [],
  "keyword_relevance": {},
  "abbreviations": {},
  "headline": "We prove some important facts in this paper",
  "highlights": [],
  "findings": [],
  "summary": [],
  "structured_summary": {
    "Introduction": [
    ],
    "Methods": [
    ],
    "Results": [
    ],
    "Conclusion": [
    ]
  }
}

This endpoint extracts highlights from a remote URL. The remote URL can resolve to a document in any of the formats listed for the POST endpoint.

HTTP Request

GET https://api.scholarcy.com/api/highlights/extract

Query Parameters

Parameter | Default | Description
url | null | URL of a public, open-access document.
text | null | Plain text content to be processed.
start_page | 1 | Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
end_page | null | Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
external_metadata | false | If true, fetch article metadata from the relevant remote repository (e.g. CrossRef).
wiki_links | false | If true, map extracted key terms to their Wikipedia pages.
reference_links | false | If true, parse and link each reference to its full text location.
replace_pronouns | false | If true, replace first-person pronouns with third-person mentions (the author(s), they).
key_points | 5 | The number of key points/key takeaway items to extract.
sampling | representative | For large documents, when extracting key terms, use either a representative sample of the full content, or the full-text content.
extract_snippets | true | If true, sample snippets from each section; otherwise, sample the full text.
add_background_info | false | If true, generate an introductory sentence. Useful for generating an abstract from an article.
add_concluding_info | false | If true, generate a concluding sentence. Useful for generating an abstract from an article.
structured_summary | false | If true, summarise each of the main sections separately, and then provide a summary structured according to those sections.
summary_engine | v1 | v1: best for articles. v2: best for book chapters.
highlights_algorithm | weighted | weighted: attend more closely to the results and conclusion. unweighted: attend to all content equally.
headline_from | highlights | highlights: use the highest scoring highlight as the headline. summary: use the first summary sentence as the headline. conclusions: use the first conclusion statement as the headline. claims: use the main claim as the headline.
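Because the GET endpoint takes its parameters in the query string, it can help to see how a request URL is assembled from the options listed above. The sketch below is illustrative only: the helper name is hypothetical, and sending booleans as lowercase true/false is an assumption based on the wording of the parameter descriptions, not a documented requirement.

```python
from urllib.parse import urlencode

HIGHLIGHTS_ENDPOINT = 'https://api.scholarcy.com/api/highlights/extract'


def build_highlights_url(document_url, **options):
    """Build a GET URL for the highlights endpoint from keyword options."""
    params = {'url': document_url}
    for name, value in options.items():
        if isinstance(value, bool):
            # Assumption: the API accepts lowercase 'true'/'false' strings.
            value = 'true' if value else 'false'
        params[name] = value
    # urlencode percent-encodes the document URL so it is safe inside a query string.
    return HIGHLIGHTS_ENDPOINT + '?' + urlencode(params)


print(build_highlights_url(
    'https://www.nature.com/articles/s41746-019-0180-3',
    key_points=3,
    wiki_links=True,
))
```

If you use the requests library, passing the same dictionary as params to requests.get performs this encoding for you.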

Extract Structured Content

The API endpoints at https://api.scholarcy.com/api/metadata/extract and /api/metadata/basic will convert a document into structured, machine-readable data in JSON format.

POST a local file to extract content

require 'rest-client'
AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/metadata/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb'),
            :start_page => 24,
            :end_page => 37
          })
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/metadata/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

params = {'start_page': 24, 'end_page': 37}
with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          data=params,
          timeout=timeout)
    print(r.json())

curl "https://api.scholarcy.com/api/metadata/extract" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf" \
  -F "start_page=24" \
  -F "end_page=37"

The above command returns JSON structured like this:

{
  "filename": "filename.pdf",
  "content_type": "application/pdf",
  "file_size": 123456,
  "metadata": {
    "title": "Article title",
    "author": "Smith, J.",
    "pages": "15",
    "date": 2021,
    "affiliations": [
      "Department of Silly Walks, University of Life, London, UK"
    ],
    "identifiers": {
      "arxiv": null,
      "doi": "10.1010/101010.10.10.1101010",
      "isbn": null,
      "doc_id": null
    },
    "abstract": "This is a very exciting paper. Please read it.",
    "references": [],
    "emails": [
      "author@email.com"
    ],
    "funding": [
      {
        "award-group": [
          {
            "funding-source": "FEDER/COMPETE",
            "award-id": [
              "AAA/BBB/04007/2019"
            ]
          }
        ],
        "funding-statement": "We acknowledge financial support from Fundação para a Ciência e a Tecnologia and FEDER/COMPETE (grant AAA/BBB/04007/2019)"
      }
    ],
    "table_captions": [
      {
        "id": "1",
        "caption": "Sample demographics and characteristics"
      },
      {
        "id": "2",
        "caption": "Construct measurements"
      }
    ],
    "figure_captions": []
  },
  "sections": {
    "introduction": [
      "Introduction section contents"
    ],
    "methodology": [
      "Methods section contents"
    ],
    "findings": [
      "Main results contents"
    ],
    "conclusion": [
      "Concluding remarks"
    ],
    "limitations": [
      "There are also several limitations to this research. Small sample size was an issue."
    ],
    "acknowledgements": [
      "We'd like to thank our supervisors for support, tea and biscuits."
    ],
    "funding": [
      "The authors acknowledge financial support from Fundação para a Ciência e a Tecnologia and FEDER/COMPETE (grant AAA/BBB/04007/2019)"
    ],
    "future_work": [
      "More research is needed to better understand what is going on."
    ],
    "objectives": [
      "The aim of this research is to provide insights into the inner workings of cellular processes."
    ]
  },
  "structured_content": [
    {
      "heading": "ABSTRACT",
      "content": [
        "This is a very exciting paper. Please read it."
      ]
    },
    {
      "heading": "INTRODUCTION",
      "content": [
        "Introduction paragraph 1",
        "Introduction paragraph 2"
      ]
    },
    {
      "heading": "RESEARCH METHODOLOGY",
      "content": [
        "Methods paragraph 1",
        "Methods paragraph 2"
      ]
    },
    {
      "heading": "FINDINGS AND DISCUSSION",
      "content": [
        "Results paragraph 1",
        "Results paragraph 2"
      ]
    },
    {
      "heading": "CONCLUSION",
      "content": [
        "Conclusion paragraph 1",
        "Conclusion paragraph 2"
      ]
    }
  ],
  "participants": [
    {
      "participant": "Patients",
      "number": 15,
      "context": "Fifteen patients participated in the study."
    }
  ],
  "statistics": [
    {
      "tests": {
        "context": "We performed exploratory factor analysis using SPSS 20",
        "tests": [
          {
            "test": "exploratory factor analysis"
          }
        ]
      }
    },
    {
      "tests": {
        "context": "We performed confirmatory factor analyses with AMOS 20 using the maximum likelihood estimation method",
        "tests": [
          {
            "test": "confirmatory factor analyses"
          },
          {
            "test": "maximum likelihood estimation method"
          }
        ]
      }
    },
    {
      "p_value": "P < 0.001",
      "context": "We also noted significant differences when we performed a one-way between-groups analysis of variance on each of the 14 items (P < 0.001)",
      "tests": {
        "tests": [
          {
            "test": "analysis of variance",
            "value": "P < 0.001"
          }
        ]
      }
    }
  ],
  "keywords": [
    "atomic force microscopy",
    "dna nanostructure",
    "drug release",
    "single-stranded DNA",
    "double-stranded DNA",
    "dox molecule"
  ],
  "keyword_relevance": {
    "atomic force microscopy": 0.345678,
    "dna nanostructure": 0.23456,
    "drug release": 0.12345,
    "single-stranded DNA": 0.034567,
    "double-stranded DNA": 0.02345,
    "dox molecule": 0.01234
  },
  "abbreviations": {
    "DONs": "DNA origami nanostructures",
    "ROS": "reactive oxygen species",
    "DNase I": "deoxyribonuclease I",
    "PEG": "polyethylene glycol"
  },
  "headline": "We prove some important facts in this paper",
  "top_statements": [
    "Facts are very important.",
    "The force is strong in this one.",
    "We ran some tests and this is what we found"
  ],
  "findings": [
    "A statistically significant difference was noted between the four groups on the combined dependent variables",
    "We also noted significant differences when we performed a one-way between-groups analysis of variance on each of the 14 items (P < 0.001)"
  ],
  "facts": [],
  "claims": [],
  "summary": [],
  "structured_summary": {
    "Introduction": [
      "Introduction paragraph 1",
      "Introduction paragraph 2"
    ],
    "Methods": [
      "We mixed some chemicals.",
      "We heated them up.",
      "We distilled the mixture."
    ],
    "Results": [
      "There was a big explosion",
      "But the crystals were pure",
      "We identified a new compound"
    ],
    "Conclusion": [
      "We proved some important things and we summarise them here.",
      "Further work is necessary"
    ]
  }
}

This endpoint extracts structured content from a local file. File formats supported are:

HTTP Request

POST https://api.scholarcy.com/api/metadata/extract

Query Parameters

Parameter Default Description
file null A file object.
url null URL of public, open-access document.
text null Plain text content to be processed.
start_page 1 Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
end_page null Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
external_metadata false If true, fetch article metadata from the relevant remote repository (e.g. CrossRef).
parse_references false If true, parse into BibTeX and link each reference to its full text location.
reference_style ensemble Referencing style used by the document, if known, or use the default. Available values : acs, ama, anystyle, apa, chicago, ensemble, experimental, harvard, ieee, mhra, mla, nature, vancouver.
reference_format text Output references in plain text or bibtex format.
generate_summary true Create an extractive summary of the article.
summary_engine v1 v1: Best for articles. v2: best for book chapters.
replace_pronouns false If true, replace first-person pronouns in summary with third-person mentions (the author(s), they).
strip_dialogue false If true, remove dialogue and quoted text from input prior to summarising.
summary_size 400 Length of summary in words.
summary_percent 0 Length of summary as a % of the original article.
structured_summary false If true, summarise each of the main sections separately, and then provide a summary structured according to those sections.
keyword_method sgrank+acr Available values : sgrank, sgrank+np, sgrank+acr, textrank, np, regex.
keyword_sample representative For large documents, when extracting key terms, use either a representative sample of the full content, or the fulltext content.
keyword_limit 25 Target number of key terms to extract.
abbreviation_method schwartz Select an abbreviation extraction method. Available values: schwartz, statistical, ensemble.
wiki_links false If true, map extracted key terms to their Wikipedia pages.
extract_facts true Extract SVO-style factual statements from the article.
extract_claims true Extract specific claims made by the article.
key_points 5 The number of key points/key takeaway items to extract.
citation_contexts false If true, extract the inline citation contexts (preceding and current sentences).
inline_citation_links false If true, link inline citations to their identifiers in the references.
extract_pico true Extract population, intervention, control, outcome data.
extract_tables false If true, extract tabular data as CSV/Excel files.
extract_figures false If true, extract figures and images as PNG files.
require_captions true Requires an accompanying caption to trigger figure/table extraction.
extract_sections true Extracts section headers and paragraphs.
include_bodytext true If extracting sections, includes the main body text content for each section.
unstructured_content false If true, include a raw, unstructured text dump of the file.
extract_snippets true If true, sample snippets from each section, otherwise, sample the full text.
engine v1 PDF text extraction engine. v1: best general purpose. v2: best for articles containing marginal line numbering or narrow column gutters.
image_engine v1 Image extraction engine. v1: best for bitmap images. v2: best for line images. Available values : v1, v2, v1+v2.
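
Several parameters in the table above accept only a fixed set of values. As a minimal sketch, the hypothetical helper below (`build_extract_params` is an illustrative name, not part of the API) checks those values on the client before a request is sent. The allowed values are copied from the table; lower-casing booleans to `true`/`false` mirrors the curl examples and is an assumption about how the server parses form fields.

```python
# Allowed values copied from the parameter table above.
ALLOWED_VALUES = {
    'reference_style': {'acs', 'ama', 'anystyle', 'apa', 'chicago', 'ensemble',
                        'experimental', 'harvard', 'ieee', 'mhra', 'mla',
                        'nature', 'vancouver'},
    'keyword_method': {'sgrank', 'sgrank+np', 'sgrank+acr', 'textrank', 'np', 'regex'},
    'abbreviation_method': {'schwartz', 'statistical', 'ensemble'},
    'image_engine': {'v1', 'v2', 'v1+v2'},
}


def build_extract_params(**overrides):
    """Hypothetical helper: validate enumerated parameters and return form data."""
    for name, value in overrides.items():
        allowed = ALLOWED_VALUES.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(f'{name}={value!r} is not one of {sorted(allowed)}')
    # Assumption: booleans are sent as lower-case "true"/"false", as in the curl examples.
    return {key: (str(value).lower() if isinstance(value, bool) else value)
            for key, value in overrides.items()}


params = build_extract_params(structured_summary=True,
                              keyword_method='textrank',
                              summary_size=200)
print(params)
```

The returned dictionary can be passed as the `data` argument of `requests.post` alongside the `files` payload.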

Output fields

Field Description
filename The filename of the uploaded document, or input URL slug
content_type The file or URL MIME type
metadata Structured article metadata
message Any error or status messages
title Article title
author List of authors
pages Number of pages in the document
date Article date
affiliations Author affiliations
journal Journal title (from CrossRef)
volume Journal volume (from CrossRef)
page Journal page range (from CrossRef)
cited_by Citation count (from CrossRef)
identifiers Any identifier extracted from the document, such as DOI, ISBN, arXiv ID, or other identifier. If an open-access version of the paper is available, the URL to that version will be displayed here.
abstract The author-written abstract, if available, or a proxy for the abstract, such as background, introduction, preface etc.
keywords Author-supplied keywords
references The plain reference strings extracted from the end of the article, or from the footnotes
emails Email addresses of the authors
type Article type: journal-article, book-chapter, preprint, web-page, review-article, case-study, report
references_ris RIS parse of the references
links Any URLs identified in the document
author_conclusions Author-stated conclusions/takeaways
funding Funding statement structured as follows: "award-group": [{"funding-source": "National Institutes of Health", "award-id": ["R43HL137469"] }]
table_captions Table captions
figure_captions Figure captions
tables_url Link to download the tables as Excel
figure_urls List of links to download extracted images as PNG files
word_count A range representing maximum and minimum estimated word count. The maximum includes appendices and supplementary information. The minimum includes the core article body text. Both exclude references and footnotes.
is_oa Boolean flag indicating whether the document is open access. This flag is only present if the input is a DOI URL, e.g. https://doi.org/10.1177/0846537120913497
oa_status Open access status: closed, bronze, green, or gold. This field is only present if the input is a DOI URL, e.g. https://doi.org/10.1177/0846537120913497
sections Snippets from each main section in the article
introduction, methods, results, conclusion If section headings can be mapped to standard names such as Introduction, Methods, Results, Conclusions, these snippets are shown here
funding Any funding statements
disclosures Any disclosures of conflicts of interest
ethical_compliance Any information about consent and ethical regulations
data_availability Any information about data and code availability related to this study
limitations Any discussion of study limitations
future_work Any information about further research needed and future work
registrations Any study registration identifiers
structured_content The section headings as they appear in the source document, along with their full section content.
participants Quantifiable information about the study subjects
statistics Information about statistical tests and analysis performed in the study
populations Quantifiable information about the population background
keywords A combination of the author-supplied keywords, plus new keywords or key terms extracted from the document
keyword_relevance Keywords mapped to their relevance scores
species Any Latin species names detected
summary An extractive summary of the main points of the entire article.
structured_summary An extractive summary structured according to the main sections of the article.
reference_links Shown if reference parsing has been enabled. This contains links to the full text for each of the references in the paper
facts Subject-predicate-object statements expressed in the article
claims Claims made by the authors of the study
findings Any important, quantitative findings extracted from the document, such as statistically significant results
key_statements A longer set of important sentences, from which the top_statements are selected.
top_statements The top 3-7 key points in the document. Typically, these highlights will include introductory and concluding information, as well as the main claims and findings of the article
headline A short, one line summary of the entire article. This headline attempts to express the main finding or main result of the paper.
abbreviations Abbreviations and their fully spelt out names, extracted from the document
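
The fields above can be read directly from the parsed JSON. The sketch below ranks keywords by their `keyword_relevance` score and lists the statistical tests found in `statistics`. It assumes each `statistics` entry has the shape shown in the sample response (a `tests` object holding a `tests` list); the helper names and the trimmed `sample` dictionary are illustrative only.

```python
# Trimmed sample mirroring the response shown earlier in this section.
sample = {
    'headline': 'We prove some important facts in this paper',
    'keyword_relevance': {
        'atomic force microscopy': 0.345678,
        'dna nanostructure': 0.23456,
        'drug release': 0.12345,
        'dox molecule': 0.01234,
    },
    'statistics': [
        {'p_value': 'P < 0.001',
         'tests': {'tests': [{'test': 'analysis of variance',
                              'value': 'P < 0.001'}]}},
    ],
}


def top_keywords(data, limit=3):
    """Return the highest-scoring key terms, best first."""
    relevance = data.get('keyword_relevance') or {}
    ranked = sorted(relevance.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _score in ranked[:limit]]


def statistical_tests(data):
    """Return (test name, value) pairs; value is None when not reported."""
    pairs = []
    for entry in data.get('statistics') or []:
        for test in (entry.get('tests') or {}).get('tests', []):
            pairs.append((test.get('test'), test.get('value')))
    return pairs


print(top_keywords(sample, limit=2))
# ['atomic force microscopy', 'dna nanostructure']
print(statistical_tests(sample))
# [('analysis of variance', 'P < 0.001')]
```

Using `.get()` with fallbacks keeps the helpers safe when a field is absent, for example when `extract_pico` is disabled.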

GET structured content from a URL

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/metadata/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}

request = RestClient::Request.new(
          :method => :get,
          :url => POST_ENDPOINT,
          # rest-client reads GET query parameters from the :params key of the headers hash
          :headers => headers.merge(
            :params => {
              :url => 'https://www.nature.com/articles/s41746-019-0180-3',
              :start_page => 1
            }
          ))
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/metadata/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

payload = {
'url': 'https://www.nature.com/articles/s41746-019-0180-3',
'start_page': 1
}
r = requests.get(POST_ENDPOINT,
      headers=headers,
      params=payload,
      timeout=timeout)
print(r.json())

curl -G "https://api.scholarcy.com/api/metadata/extract" \
  -H "Authorization: Bearer abcdefg" \
  -d "url=https://www.nature.com/articles/s41746-019-0180-3" \
  -d "start_page=1"

The above command returns JSON structured as for the POST endpoint:

{
  "filename": "filename.pdf",
  "content_type": "application/pdf",
  "file_size": 123456,
  "metadata": {
    "title": "Article title",
    "author": "Smith, J.",
    "pages": "15",
    "date": 2021,
    "affiliations": [
      "Department of Silly Walks, University of Life, London, UK",
    ],
    "identifiers": {
      "arxiv": null,
      "doi": "10.1010/101010.10.10.1101010",
      "isbn": null,
      "doc_id": null
    },
    "abstract": "This is a very exciting paper. Please read it.",
    "references": [],
    "emails": [
      "author@email.com"
    ],
    "funding": [],
    "table_captions": [],
    "figure_captions": []
  },
  "sections": {
    "introduction": [],
    "methodology": [],
    "findings": [],
    "conclusion": [],
    "limitations": [],
    "acknowledgements": [],
    "funding": [],
    "future_work": [],
    "objectives": []
  },
  "structured_content": [],
  "participants": [],
  "statistics": [],
  "keywords": [],
  "keyword_relevance": {},
  "abbreviations": {},
  "headline": "We prove some important facts in this paper",
  "top_statements": [],
  "findings": [],
  "facts": [],
  "claims": [],
  "summary": [],
  "structured_summary": {}
}

This endpoint extracts structured content from a remote URL. The remote URL can resolve to a document in any of the formats listed for the POST endpoint.

HTTP Request

GET https://api.scholarcy.com/api/metadata/extract

Query Parameters

Parameter Default Description
url null URL of public, open-access document.
text null Plain text content to be processed.
start_page 1 Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
end_page null Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
external_metadata false If true, fetch article metadata from the relevant remote repository (e.g. CrossRef).
parse_references false If true, parse into BibTeX and link each reference to its full text location.
reference_style ensemble Referencing style used by the document, if known, or use the default. Available values : acs, ama, anystyle, apa, chicago, ensemble, experimental, harvard, ieee, mhra, mla, nature, vancouver.
reference_format text Output references in plain text or bibtex format.
generate_summary true Create an extractive summary of the article.
summary_engine v1 v1: Best for articles. v2: best for book chapters.
replace_pronouns false If true, replace first-person pronouns in summary with third-person mentions (the author(s), they).
strip_dialogue false If true, remove dialogue and quoted text from input prior to summarising.
summary_size 400 Length of summary in words.
summary_percent 0 Length of summary as a % of the original article.
structured_summary false If true, summarise each of the main sections separately, and then provide a summary structured according to those sections.
keyword_method sgrank+acr Available values : sgrank, sgrank+np, sgrank+acr, textrank, np, regex.
keyword_sample representative For large documents, when extracting key terms, use either a representative sample of the full content, or the fulltext content.
keyword_limit 25 Target number of key terms to extract.
abbreviation_method schwartz Select an abbreviation extraction method. Available values: schwartz, statistical, ensemble.
wiki_links false If true, map extracted key terms to their Wikipedia pages.
extract_facts true Extract SVO-style factual statements from the article.
extract_claims true Extract specific claims made by the article.
key_points 5 The number of key points/key takeaway items to extract.
citation_contexts false If true, extract the inline citation contexts (preceding and current sentences).
inline_citation_links false If true, link inline citations to their identifiers in the references.
extract_pico true Extract population, intervention, control, outcome data.
extract_tables false If true, extract tabular data as CSV/Excel files.
extract_figures false If true, extract figures and images as PNG files.
require_captions true Requires an accompanying caption to trigger figure/table extraction.
extract_sections true Extracts section headers and paragraphs.
include_bodytext true If extracting sections, includes the main body text content for each section.
unstructured_content false If true, include a raw, unstructured text dump of the file.
extract_snippets true If true, sample snippets from each section, otherwise, sample the full text.
engine v1 PDF text extraction engine. v1: best general purpose. v2: best for articles containing marginal line numbering or narrow column gutters.
image_engine v1 Image extraction engine. v1: best for bitmap images. v2: best for line images. Available values : v1, v2, v1+v2.
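
For GET requests the parameters travel in the query string, so the document URL itself has to be percent-encoded. The sketch below builds the full request URL with the Python standard library only; `build_get_url` is an illustrative helper name, not part of the API. The `requests` library does the same encoding automatically when the values are passed through `params=`.

```python
from urllib.parse import urlencode

API_DOMAIN = 'https://api.scholarcy.com'
GET_ENDPOINT = API_DOMAIN + '/api/metadata/extract'


def build_get_url(document_url, **options):
    """Return the endpoint URL with the document URL and options encoded as a query string."""
    query = {'url': document_url, **options}
    return GET_ENDPOINT + '?' + urlencode(query)


print(build_get_url('https://www.nature.com/articles/s41746-019-0180-3',
                    start_page=1))
# https://api.scholarcy.com/api/metadata/extract?url=https%3A%2F%2Fwww.nature.com%2Farticles%2Fs41746-019-0180-3&start_page=1
```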

Extract and Parse References

The API endpoint at https://ref.scholarcy.com/api/references/download extracts references and downloads them in a variety of formats (see the reference_format parameter below).

The API endpoint at https://ref.scholarcy.com/api/references/extract extracts references in JSON format and optionally provides a link resolver URL for each.

POST a local file and download references in your chosen format

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/download'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb'),
            :reference_format => 'jats',
            :reference_style => 'ensemble'
          })
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/download'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

params = {'reference_format': 'jats', 'reference_style': 'ensemble'}
with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          data=params,
          timeout=timeout)
    print(r.text)  # the jats format returns XML, not JSON

curl "https://ref.scholarcy.com/api/references/download" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf" \
  -F "reference_format=jats" \
  -F "reference_style=ensemble"

The above command returns XML structured like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "https://jats.nlm.nih.gov/archiving/1.1d1/JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" article-type="research-article">
    <front>
        <article-meta>
            <article-id pub-id-type="doi">10.1371/journal.pone.0111913</article-id>
        </article-meta>
    </front>
    <back>
        <ref-list>
            <title>References</title>
            <ref id="ref_1">
                <element-citation publication-type="journal">
                    <person-group person-group-type="author">
                        <string-name>Teuten, E.</string-name>
                        <string-name>Rowland, S.</string-name>
                        <string-name>Galloway, T.</string-name>
                        <string-name>Thompson, R.</string-name>
                    </person-group>
                    <year>2007</year>
                    <article-title>Potential for plastics to transport hydrophobic contaminants</article-title>
                    <source>Environ Sci Technol</source>
                    <volume>41</volume>
                    <fpage>7759</fpage>
                </element-citation>
            </ref>

            <ref id="ref_2">
                <element-citation publication-type="journal">
                    <person-group person-group-type="author">
                        <string-name>Mato, Y.</string-name>
                        <string-name>Isobe, T.</string-name>
                        <string-name>Takada, H.</string-name>
                        <string-name>Kanehiro, H.</string-name>
                        <string-name>Ohtake, C.</string-name>
                    </person-group>
                    <year>2001</year>
                    <article-title>Plastic resin pellets as a transport medium for toxic chemicals in the marine environment</article-title>
                    <source>Environ Sci Technol</source>
                    <volume>35</volume>
                    <fpage>318</fpage>
                </element-citation>
            </ref>
        </ref-list>
    </back>
</article>

This endpoint extracts references in the chosen output format from a local file. File formats supported are:

HTTP Request

POST https://ref.scholarcy.com/api/references/download

Query Parameters

Parameter Default Description
file null A file object.
document_type full_paper full_paper: a complete research paper, chapter or thesis. bibliography: a file containing just bibliographic items.
references null Optional. If you don't want to upload a file, you can pass a text string containing line-delimited references that you wish to parse into the desired output format.
reference_style ensemble Referencing style used by the document. If unsure, use the default ensemble or choose experimental. Other options include: acs, ama, apa, chicago, harvard, ieee, mhra, mla, nature, vancouver.
reference_format ris Output format. Options include: bibtex: standard BibTeX format. ris: RIS (Research Information Systems) format, as used by EndNote. xml: CrossRef's XML format. jats: The Journal Article Tag Suite XML format for references.
parent_doi null Only required for CrossRef XML output. If the DOI of the input document is not easily extractable from the document itself, then you can provide it here.
parent_title null Only required for CrossRef XML output. If the title of the input document is not easily extractable from the document itself, then you can provide it here.
engine v1 PDF processing engine. v1: uses the XPDF tool and works best for most PDFs. v2: uses the poppler tool and may work better for PDFs with marginal line numbering, with multiple columns, or for those PDFs where v1 fails to extract useful information.
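
The `references` parameter takes line-delimited reference strings instead of a file. As a rough sketch, the hypothetical helper below (`build_references_payload` is not part of the API) joins a list of reference strings with newlines and checks the output format against the options listed in the table. Sending the result as ordinary form data is an assumption based on the file-upload examples.

```python
# Output formats listed for the reference_format parameter above.
REFERENCE_FORMATS = {'bibtex', 'ris', 'xml', 'jats'}


def build_references_payload(references, reference_format='ris',
                             reference_style='ensemble'):
    """Hypothetical helper: build form data from a list of reference strings."""
    if reference_format not in REFERENCE_FORMATS:
        raise ValueError(f'Unsupported reference_format: {reference_format!r}')
    # One reference per line; blank entries are dropped.
    cleaned = [ref.strip() for ref in references if ref.strip()]
    return {
        'references': '\n'.join(cleaned),
        'reference_format': reference_format,
        'reference_style': reference_style,
    }


payload = build_references_payload([
    'Teuten E, Rowland S, Galloway T, Thompson R (2007) Potential for plastics '
    'to transport hydrophobic contaminants. Environ Sci Technol 41: 7759-7764.',
    'Rochman C, Browne M, Halpern B, Hentschel B, Hoh E, et al. (2013) '
    'Classify plastic waste as hazardous. Nature 494: 169-171.',
], reference_format='bibtex')

# Assumed usage, mirroring the file-upload example but without a file:
# r = requests.post(POST_ENDPOINT, headers=headers, data=payload, timeout=30)
print(payload['references'].count('\n') + 1, 'references prepared')
```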

GET references from a remote URL and download them in your chosen format

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/download'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}

request = RestClient::Request.new(
          :method => :get,
          :url => POST_ENDPOINT,
          # rest-client reads GET query parameters from the :params key of the headers hash
          :headers => headers.merge(
            :params => {
              :url => 'https://www.nature.com/articles/s41746-019-0180-3.pdf',
              :reference_format => 'jats',
              :reference_style => 'ensemble'
            }
          ))
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/download'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

payload = {
'url': 'https://www.nature.com/articles/s41746-019-0180-3.pdf',
'reference_format': 'jats',
'reference_style': 'ensemble'
}
r = requests.get(POST_ENDPOINT,
      headers=headers,
      params=payload,
      timeout=timeout)
print(r.text)  # the jats format returns XML, not JSON

curl -G "https://ref.scholarcy.com/api/references/download" \
  -H "Authorization: Bearer abcdefg" \
  -d "url=https://www.nature.com/articles/s41746-019-0180-3.pdf" \
  -d "reference_format=jats" \
  -d "reference_style=ensemble"

The above command returns XML structured as for the POST endpoint:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "https://jats.nlm.nih.gov/archiving/1.1d1/JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" article-type="research-article">
    <front>
        <article-meta>
            <article-id pub-id-type="doi">10.1371/journal.pone.0111913</article-id>
        </article-meta>
    </front>
    <back>
        <ref-list>
            <title>References</title>
            <ref id="ref_1">
                <element-citation publication-type="journal">
                    <person-group person-group-type="author">
                        <string-name>Teuten, E.</string-name>
                        <string-name>Rowland, S.</string-name>
                        <string-name>Galloway, T.</string-name>
                        <string-name>Thompson, R.</string-name>
                    </person-group>
                    <year>2007</year>
                    <article-title>Potential for plastics to transport hydrophobic contaminants</article-title>
                    <source>Environ Sci Technol</source>
                    <volume>41</volume>
                    <fpage>7759</fpage>
                </element-citation>
            </ref>
        </ref-list>
    </back>
</article>

This endpoint extracts references in the chosen output format from a remote URL. The remote URL can resolve to a document in any of the formats listed for the POST endpoint.

HTTP Request

GET https://ref.scholarcy.com/api/references/download

Query Parameters

Parameter Default Description
url null URL of public, open-access document.
document_type full_paper full_paper: a complete research paper, chapter or thesis. bibliography: a file containing just bibliographic items.
references null Optional. If you don't want to upload a file, you can pass a text string containing line-delimited references that you wish to parse into the desired output format.
reference_style ensemble Referencing style used by the document. If unsure, use the default ensemble or choose experimental. Other options include: acs, ama, apa, chicago, harvard, ieee, mhra, mla, nature, vancouver.
reference_format ris Output format. Options include: bibtex: standard BibTeX format. ris: RIS (Research Information Systems) format, as used by EndNote. xml: CrossRef's XML format. jats: The Journal Article Tag Suite XML format for references.
parent_doi null Only required for CrossRef XML output. If the DOI of the input document is not easily extractable from the document itself, then you can provide it here.
parent_title null Only required for CrossRef XML output. If the title of the input document is not easily extractable from the document itself, then you can provide it here.
engine v1 PDF processing engine. v1: uses the XPDF tool and works best for most PDFs. v2: uses the poppler tool and may work better for PDFs with marginal line numbering, with multiple columns, or for those PDFs where v1 fails to extract useful information.
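
When `reference_format` is `jats`, the response body is XML rather than JSON. The sketch below reads each `<ref>` element into a dictionary using Python's standard `xml.etree.ElementTree` module. The element names are taken from the sample output above; the function name and the trimmed `sample` string are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed sample following the structure of the JATS output shown above.
sample = """<article>
  <back>
    <ref-list>
      <ref id="ref_1">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <string-name>Teuten, E.</string-name>
            <string-name>Rowland, S.</string-name>
          </person-group>
          <year>2007</year>
          <article-title>Potential for plastics to transport hydrophobic contaminants</article-title>
          <source>Environ Sci Technol</source>
        </element-citation>
      </ref>
    </ref-list>
  </back>
</article>"""


def parse_jats_references(xml_text):
    """Return one dictionary per <ref> element in a JATS reference list."""
    root = ET.fromstring(xml_text)
    references = []
    for ref in root.iter('ref'):
        citation = ref.find('element-citation')
        if citation is None:
            continue  # skip refs without a structured citation
        references.append({
            'id': ref.get('id'),
            'authors': [name.text for name in citation.iter('string-name')],
            'year': citation.findtext('year'),
            'title': citation.findtext('article-title'),
            'source': citation.findtext('source'),
        })
    return references


for ref in parse_jats_references(sample):
    print(ref['id'], ref['year'], ref['title'])
```

With a live response, passing `r.content` (bytes) to the function lets the parser honour the encoding declared in the XML header.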

POST a local file and return references as JSON data

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb'),
            :reference_style => 'ensemble',
            :resolve_references => true
          })
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/references/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

params = {'reference_style': 'ensemble', 'resolve_references': True}
with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          data=params,
          timeout=timeout)
    print(r.json())

curl "https://ref.scholarcy.com/api/references/extract" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf" \
  -F "reference_style=ensemble" \
  -F "resolve_references=true"

The above command returns JSON structured like this:

{
  "filename": "plastic_pollution.pdf",
  "metadata": {
    "arxiv": null,
    "doi": "10.1371/journal.pone.0111913",
    "isbn": null,
    "date": 2014
  },
  "references": [
    "1. Teuten E, Rowland S, Galloway T, Thompson R (2007) Potential for plastics to transport hydrophobic contaminants. Environ Sci Technol 41: 7759–7764.",
    "2. Mato Y, Isobe T, Takada H, Kanehiro H, Ohtake C, et al. (2001) Plastic resin pellets as a transport medium for toxic chemicals in the marine environment. Environ Sci Technol 35: 318–324.",
    "3. Rochman C, Browne M, Halpern B, Hentschel B, Hoh E, et al. (2013) Classify plastic waste as hazardous. Nature 494: 169–171.",
    "4. Barnes D, Galgani F, Thompson R, Barlaz M (2009) Accumulation and fragmentation of plastic debris in global environments. Philos Trans R Soc Lond B Biol Sci 364: 1985–1998.",
    "5. Barnes D, Walters A, Goncalves L (2010) Macroplastics at sea around Antarctica. Mar Environ Res 70: 250–252.",
    "6. Law K, Moret-Ferguson S, Maximenko N, Proskurowski G, Peacock E, et al. (2010) Plastic accumulation in the North Atlantic Subtropical Gyre. Science 329: 1185–1188.",
    "7. Eriksen M, Maximenko N, Thiel M, Cummins A, Lattin G, et al. (2013) Plastic marine pollution in the South Pacific Subtropical Gyre. Mar Pollut Bull 68: 71–76.",
    "8. Goldstein M, Titmus A, Ford M (2013) Scales of spatial heterogeneity of plastic marine debris in the northeast Pacific Ocean, PloS one 8: doi: 10.1371/journal.pone.0080020.",
    "9. Law K, Moret-Ferguson S, Goodwin D, Zettler E, DeForce E, et al. (2014) Distribution of surface plastic debris in the eastern Pacific Ocean from an 11-year dataset. Environ Sci Technol: doi: 10.1021/ es4053076.",
    "10. Reisser J, Shaw J, Wilcox C, Hardesty B, Proietti M (2013) Marine plastic pollution in the waters around Australia: Characteristics, concentrations and pathways. PloS one 8: doi:10.1371/ journal.pone.0080466.",
  ],
  "bibtex": "@article{teuten2007a,\n  author = {Teuten, E. and Rowland, S. and Galloway, T. and Thompson, R.},\n  date = {2007},\n  title = {Potential for plastics to transport hydrophobic contaminants},\n  journal = {Environ Sci Technol},\n  volume = {41},\n  pages = {7759–7764},\n  language = {}\n}\n@article{mato2001a,\n  author = {Mato, Y. and Isobe, T. and Takada, H. and Kanehiro, H. and Ohtake, C.},\n  date = {2001},\n  title = {Plastic resin pellets as a transport medium for toxic chemicals in the marine environment},\n  journal = {Environ Sci Technol},\n  volume = {35},\n  pages = {318–324},\n  more-authors = {true},\n  language = {}\n}\n@article{rochman2013a,\n  author = {Rochman, C. and Browne, M. and Halpern, B. and Hentschel, B. and Hoh, E.},\n  date = {2013},\n  title = {Classify plastic waste as hazardous},\n  journal = {Nature},\n  volume = {494},\n  pages = {169–171},\n  more-authors = {true},\n  language = {}\n}\n@article{barnes2009a,\n  author = {Barnes, D. and Galgani, F. and Thompson, R. and Barlaz, M.},\n  date = {2009},\n  title = {Accumulation and fragmentation of plastic debris in global environments},\n  journal = {Philos Trans R Soc Lond B Biol Sci},\n  volume = {364},\n  pages = {1985–1998},\n  language = {}\n}\n@article{barnes2010a,\n  author = {Barnes, D. and Walters, A. and Goncalves, L.},\n  date = {2010},\n  title = {Macroplastics at sea around Antarctica},\n  journal = {Mar Environ Res},\n  volume = {70},\n  pages = {250–252},\n  language = {}\n}\n@article{law2010a,\n  author = {Law, K. and Moret-Ferguson, S. and Maximenko, N. and Proskurowski, G. and Peacock, E.},\n  date = {2010},\n  title = {Plastic accumulation in the North Atlantic Subtropical Gyre},\n  journal = {Science},\n  volume = {329},\n  pages = {1185–1188},\n  more-authors = {true},\n  language = {}\n}\n@article{eriksen2013a,\n  author = {Eriksen, M. and Maximenko, N. and Thiel, M. and Cummins, A. 
and Lattin, G.},\n  date = {2013},\n  title = {Plastic marine pollution in the South Pacific Subtropical Gyre},\n  journal = {Mar Pollut Bull},\n  volume = {68},\n  pages = {71–76},\n  more-authors = {true},\n  language = {}\n}\n@book{goldstein2013a,\n  author = {Goldstein, M. and Titmus, A. and Ford, M.},\n  date = {2013},\n  title = {Scales of spatial heterogeneity of plastic marine debris in the northeast Pacific Ocean},\n  publisher = {PloS one 8},\n  doi = {doi: 10.1371/journal.pone.0080020},\n  language = {}\n}\n@article{law2014a,\n  author = {Law, K. and Moret-Ferguson, S. and Goodwin, D. and Zettler, E. and DeForce, E.},\n  date = {2014},\n  title = {Distribution of surface plastic debris in the eastern Pacific Ocean from an 11-year dataset},\n  journal = {Environ Sci Technol},\n  doi = {doi: 10.1021/es4053076},\n  more-authors = {true},\n  language = {}\n}\n@book{reisser2013a,\n  author = {Reisser, J. and Shaw, J. and Wilcox, C. and Hardesty, B. and Proietti, M.},\n  date = {2013},\n  title = {Marine plastic pollution in the waters around Australia: Characteristics, concentrations and pathways},\n  publisher = {PloS one 8},\n  doi = {doi: 10.1371/journal.pone.0080466},\n  language = {}\n}\n",
  "ris": "TY  - JOUR\nAU  - Teuten, E.\nAU  - Rowland, S.\nAU  - Galloway, T.\nAU  - Thompson, R.\nPY  - 2007\nDA  - 2007\nTI  - Potential for plastics to transport hydrophobic contaminants\nT2  - Environ Sci Technol\nVL  - 41\nSP  - 7759\nEP  - 7764\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Mato, Y.\nAU  - Isobe, T.\nAU  - Takada, H.\nAU  - Kanehiro, H.\nAU  - Ohtake, C.\nPY  - 2001\nDA  - 2001\nTI  - Plastic resin pellets as a transport medium for toxic chemicals in the marine environment\nT2  - Environ Sci Technol\nVL  - 35\nSP  - 318\nEP  - 324\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Rochman, C.\nAU  - Browne, M.\nAU  - Halpern, B.\nAU  - Hentschel, B.\nAU  - Hoh, E.\nPY  - 2013\nDA  - 2013\nTI  - Classify plastic waste as hazardous\nT2  - Nature\nVL  - 494\nSP  - 169\nEP  - 171\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Barnes, D.\nAU  - Galgani, F.\nAU  - Thompson, R.\nAU  - Barlaz, M.\nPY  - 2009\nDA  - 2009\nTI  - Accumulation and fragmentation of plastic debris in global environments\nT2  - Philos Trans R Soc Lond B Biol Sci\nVL  - 364\nSP  - 1985\nEP  - 1998\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Barnes, D.\nAU  - Walters, A.\nAU  - Goncalves, L.\nPY  - 2010\nDA  - 2010\nTI  - Macroplastics at sea around Antarctica\nT2  - Mar Environ Res\nVL  - 70\nSP  - 250\nEP  - 252\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Law, K.\nAU  - Moret-Ferguson, S.\nAU  - Maximenko, N.\nAU  - Proskurowski, G.\nAU  - Peacock, E.\nPY  - 2010\nDA  - 2010\nTI  - Plastic accumulation in the North Atlantic Subtropical Gyre\nT2  - Science\nVL  - 329\nSP  - 1185\nEP  - 1188\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Eriksen, M.\nAU  - Maximenko, N.\nAU  - Thiel, M.\nAU  - Cummins, A.\nAU  - Lattin, G.\nPY  - 2013\nDA  - 2013\nTI  - Plastic marine pollution in the South Pacific Subtropical Gyre\nT2  - Mar Pollut Bull\nVL  - 68\nSP  - 71\nEP  - 76\nC1  - true\nLA  - \nER  - \n\nTY  - BOOK\nAU  - Goldstein, M.\nAU  - Titmus, A.\nAU  - Ford, M.\nPY  - 2013\nDA  - 2013\nTI  - 
Scales of spatial heterogeneity of plastic marine debris in the northeast Pacific Ocean\nPB  - PloS one 8\nDO  - 10.1371/journal.pone.0080020\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Law, K.\nAU  - Moret-Ferguson, S.\nAU  - Goodwin, D.\nAU  - Zettler, E.\nAU  - DeForce, E.\nPY  - 2014\nDA  - 2014\nTI  - Distribution of surface plastic debris in the eastern Pacific Ocean from an 11-year dataset\nT2  - Environ Sci Technol\nDO  - 10.1021/es4053076\nC1  - true\nLA  - \nER  - \n\nTY  - BOOK\nAU  - Reisser, J.\nAU  - Shaw, J.\nAU  - Wilcox, C.\nAU  - Hardesty, B.\nAU  - Proietti, M.\nPY  - 2013\nDA  - 2013\nTI  - Marine plastic pollution in the waters around Australia: Characteristics, concentrations and pathways\nPB  - PloS one 8\nDO  - 10.1371/journal.pone.0080466\nLA  - \nER  - \n\n",
  "reference_links": [
    {
      "id": "1",
      "entry": "1. Teuten E, Rowland S, Galloway T, Thompson R (2007) Potential for plastics to transport hydrophobic contaminants. Environ Sci Technol 41: 7759–7764.",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Teuten%2C%20E.%20Rowland%2C%20S.%20Galloway%2C%20T.%20Thompson%2C%20R.%20Potential%20for%20plastics%20to%20transport%20hydrophobic%20contaminants%202007",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Teuten%2C%20E.%20Rowland%2C%20S.%20Galloway%2C%20T.%20Thompson%2C%20R.%20Potential%20for%20plastics%20to%20transport%20hydrophobic%20contaminants%202007"
    },
    {
      "id": "2",
      "entry": "2. Mato Y, Isobe T, Takada H, Kanehiro H, Ohtake C, et al. (2001) Plastic resin pellets as a transport medium for toxic chemicals in the marine environment. Environ Sci Technol 35: 318–324.",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Mato%2C%20Y.%20Isobe%2C%20T.%20Takada%2C%20H.%20Kanehiro%2C%20H.%20Plastic%20resin%20pellets%20as%20a%20transport%20medium%20for%20toxic%20chemicals%20in%20the%20marine%20environment%202001",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Mato%2C%20Y.%20Isobe%2C%20T.%20Takada%2C%20H.%20Kanehiro%2C%20H.%20Plastic%20resin%20pellets%20as%20a%20transport%20medium%20for%20toxic%20chemicals%20in%20the%20marine%20environment%202001"
    },
    {
      "id": "3",
      "entry": "3. Rochman C, Browne M, Halpern B, Hentschel B, Hoh E, et al. (2013) Classify plastic waste as hazardous. Nature 494: 169–171.",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Rochman%2C%20C.%20Browne%2C%20M.%20Halpern%2C%20B.%20Hentschel%2C%20B.%20Classify%20plastic%20waste%20as%20hazardous%202013",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Rochman%2C%20C.%20Browne%2C%20M.%20Halpern%2C%20B.%20Hentschel%2C%20B.%20Classify%20plastic%20waste%20as%20hazardous%202013"
    },
    {
      "id": "4",
      "entry": "4. Barnes D, Galgani F, Thompson R, Barlaz M (2009) Accumulation and fragmentation of plastic debris in global environments. Philos Trans R Soc Lond B Biol Sci 364: 1985–1998.",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Barnes%2C%20D.%20Galgani%2C%20F.%20Thompson%2C%20R.%20Barlaz%2C%20M.%20Accumulation%20and%20fragmentation%20of%20plastic%20debris%20in%20global%20environments%202009",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Barnes%2C%20D.%20Galgani%2C%20F.%20Thompson%2C%20R.%20Barlaz%2C%20M.%20Accumulation%20and%20fragmentation%20of%20plastic%20debris%20in%20global%20environments%202009"
    },
    {
      "id": "5",
      "entry": "5. Barnes D, Walters A, Goncalves L (2010) Macroplastics at sea around Antarctica. Mar Environ Res 70: 250–252.",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Barnes%2C%20D.%20Walters%2C%20A.%20Goncalves%2C%20L.%20Macroplastics%20at%20sea%20around%20Antarctica%202010",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Barnes%2C%20D.%20Walters%2C%20A.%20Goncalves%2C%20L.%20Macroplastics%20at%20sea%20around%20Antarctica%202010"
    },
    {
      "id": "6",
      "entry": "6. Law K, Moret-Ferguson S, Maximenko N, Proskurowski G, Peacock E, et al. (2010) Plastic accumulation in the North Atlantic Subtropical Gyre. Science 329: 1185–1188.",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Law%2C%20K.%20Moret-Ferguson%2C%20S.%20Maximenko%2C%20N.%20Proskurowski%2C%20G.%20Plastic%20accumulation%20in%20the%20North%20Atlantic%20Subtropical%20Gyre%202010",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Law%2C%20K.%20Moret-Ferguson%2C%20S.%20Maximenko%2C%20N.%20Proskurowski%2C%20G.%20Plastic%20accumulation%20in%20the%20North%20Atlantic%20Subtropical%20Gyre%202010"
    },
    {
      "id": "7",
      "entry": "7. Eriksen M, Maximenko N, Thiel M, Cummins A, Lattin G, et al. (2013) Plastic marine pollution in the South Pacific Subtropical Gyre. Mar Pollut Bull 68: 71–76.",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Eriksen%2C%20M.%20Maximenko%2C%20N.%20Thiel%2C%20M.%20Cummins%2C%20A.%20Plastic%20marine%20pollution%20in%20the%20South%20Pacific%20Subtropical%20Gyre%202013",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Eriksen%2C%20M.%20Maximenko%2C%20N.%20Thiel%2C%20M.%20Cummins%2C%20A.%20Plastic%20marine%20pollution%20in%20the%20South%20Pacific%20Subtropical%20Gyre%202013"
    },
    {
      "id": "8",
      "entry": "8. Goldstein M, Titmus A, Ford M (2013) Scales of spatial heterogeneity of plastic marine debris in the northeast Pacific Ocean, PloS one 8: doi: 10.1371/journal.pone.0080020.",
      "crossref": "https://dx.doi.org/10.1371/journal.pone.0080020"
    },
    {
      "id": "9",
      "entry": "9. Law K, Moret-Ferguson S, Goodwin D, Zettler E, DeForce E, et al. (2014) Distribution of surface plastic debris in the eastern Pacific Ocean from an 11-year dataset. Environ Sci Technol: doi: 10.1021/ es4053076.",
      "crossref": "https://dx.doi.org/10.1021/es4053076",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=https%3A//dx.doi.org/10.1021/es4053076"
    },
    {
      "id": "10",
      "entry": "10. Reisser J, Shaw J, Wilcox C, Hardesty B, Proietti M (2013) Marine plastic pollution in the waters around Australia: Characteristics, concentrations and pathways. PloS one 8: doi:10.1371/ journal.pone.0080466.",
      "crossref": "https://dx.doi.org/10.1371/journal.pone.0080466"
    }
  ]
}
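Each item in reference_links carries whichever link resolvers could be built for that reference: in the sample response above, some items have a crossref link, others have scholar_url and oa_query. The following is a minimal sketch of choosing one link per reference. The helper name best_link and the preference order (crossref, then oa_query, then scholar_url) are assumptions made for illustration and are not part of the API.

```python
# Hypothetical helper: pick one link per reference from "reference_links".
# The preference order below is an illustrative assumption, not API behaviour.
PREFERRED_KEYS = ("crossref", "oa_query", "scholar_url")


def best_link(item):
    """Return the first available link for one reference_links item, or None."""
    for key in PREFERRED_KEYS:
        if item.get(key):
            return item[key]
    return None


# Trimmed items copied from the sample response above.
reference_links = [
    {
        "id": "8",
        "crossref": "https://dx.doi.org/10.1371/journal.pone.0080020",
    },
    {
        "id": "9",
        "crossref": "https://dx.doi.org/10.1021/es4053076",
        "oa_query": "https://ref.scholarcy.com/oa_version?query=https%3A//dx.doi.org/10.1021/es4053076",
    },
]

links_by_id = {item["id"]: best_link(item) for item in reference_links}
for ref_id, link in links_by_id.items():
    print(ref_id, link)
```

An item with none of the three keys yields None, so callers can skip references for which no resolver was created.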

This endpoint extracts references from a local file, returning JSON data with embedded RIS and BibTeX content. File formats supported are:

HTTP Request

POST https://ref.scholarcy.com/api/references/extract

Query Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| file | null | A file object. |
| document_type | full_paper | full_paper: a complete research paper, chapter or thesis. bibliography: a file containing just bibliographic items. |
| references | null | Optional. If you don't want to upload a file, you can pass a text string containing line-delimited references that you wish to parse into the desired output format. |
| resolve_references | true | If true, parse each reference into BibTeX and RIS data and create link resolvers for each reference. |
| reference_style | ensemble | Referencing style used by the document. If unsure, use the default ensemble or choose experimental. Other options include: acs, ama, apa, chicago, harvard, ieee, mhra, mla, nature, vancouver. |
| engine | v1 | PDF processing engine. v1: uses the XPDF tool and works best for most PDFs. v2: uses the poppler tool and may work better for PDFs with marginal line numbering, with multiple columns, or for those PDFs where v1 fails to extract useful information. |
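The references parameter accepts line-delimited reference strings in place of a file. The sketch below prepares such a string and checks reference_style against the values listed in the table above. The helper name build_references_form is hypothetical, and sending the result as form fields (for example via the data argument of requests.post) is an assumption rather than documented behaviour.

```python
# Values for reference_style taken from the parameter table above.
REFERENCE_STYLES = {
    "ensemble", "experimental", "acs", "ama", "apa", "chicago",
    "harvard", "ieee", "mhra", "mla", "nature", "vancouver",
}


def build_references_form(reference_lines, reference_style="ensemble"):
    """Build form fields for sending reference strings instead of a file.

    Hypothetical helper: the field names come from the parameter table,
    but sending them as form data is an assumption.
    """
    if reference_style not in REFERENCE_STYLES:
        raise ValueError(f"Unknown reference_style: {reference_style!r}")
    # Drop empty entries and put one reference on each line.
    cleaned = [line.strip() for line in reference_lines if line.strip()]
    return {
        "references": "\n".join(cleaned),
        "reference_style": reference_style,
    }


form = build_references_form([
    "Aikman, H. The association between arthritis and the weather. Int. J. Biometeorol. 40, 192–199 (1997).",
    "",
    "Guedj, D. & Weinberger, A. Effect of weather conditions on rheumatic patients. Ann. Rheum. Dis. 49, 158–159 (1990).",
])
print(form["references"])
```

Validating the style locally catches typos before a request is made.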

GET references from a remote document URL and return JSON data

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
GET_ENDPOINT = API_DOMAIN + '/api/references/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
# rest-client takes query parameters from the :params key of the headers hash
params = {
  :url => 'https://www.nature.com/articles/s41746-019-0180-3.pdf',
  :resolve_references => true,
  :reference_style => 'ensemble'
}

request = RestClient::Request.new(
          :method => :get,
          :url => GET_ENDPOINT,
          :headers => headers.merge(:params => params))
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://ref.scholarcy.com'
GET_ENDPOINT = API_DOMAIN + '/api/references/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

# Sent as query parameters on the GET request
payload = {
    'url': 'https://www.nature.com/articles/s41746-019-0180-3.pdf',
    'resolve_references': True,
    'reference_style': 'ensemble'
}
r = requests.get(GET_ENDPOINT,
      headers=headers,
      params=payload,
      timeout=timeout)
print(r.json())

curl -G "https://ref.scholarcy.com/api/references/extract" \
  -H "Authorization: Bearer abcdefg" \
  --data-urlencode "url=https://www.nature.com/articles/s41746-019-0180-3.pdf" \
  -d "resolve_references=true" \
  -d "reference_style=ensemble"

The above command returns JSON data as for the POST endpoint:

{
  "filename": "s41746-019-0180-3.pdf",
  "metadata": {
    "arxiv": null,
    "doi": "10.1038/s41746-019-0180-3",
    "isbn": null,
    "date": 2019
  },
  "references": [
    "2. Hagglund, K. J. et al. Weather, beliefs about weather, and disease severity among patients with fibromyalgia. Arthritis Rheum. 7, 130–135 (1994).",
    "3. Timmermans, E. J. et al. Self-perceived weather sensitivity and joint pain in older people with osteoarthritis in six European countries: Results from the European Project on OSteoArthritis (EPOSA). BMC Musculoskelet. Disord., https://doi.org/10.1186/1471-2474-15-66 (2014).",
    "4. Smedslund, G. & Hagen, K. B. Does rain really cause pain? A systematic review of the associations between weather factors and severity of pain in people with rheumatoid arthritis. Eur. J. Pain. 15, 5–10 (2011).",
    "5. Aikman, H. The association between arthritis and the weather. Int. J. Biometeorol. 40, 192–199 (1997).",
    "6. Brennan, S. A. et al. Influence of weather variables on pain severity in end-stage osteoarthritis. Int. Orthop. 36, 643–646 (2012).",
    "7. Smedslund, G. et al. Does the weather really matter? A cohort study of influences of weather and solar conditions on daily variations of joint pain in patients with rheumatoid arthritis. Arthritis Rheum. 61, 1243–1247 (2009).",
    "8. Bossema, E. R. et al. Influence of weather on daily symptoms of pain and fatigue in female patients with fibromyalgia: a multilevel regression analysis. Arthritis Care Res. (Hoboken) 65, 1019–1025 (2013).",
    "9. Duong, V. et al. Does weather affect daily pain intensity levels in patients with acute low back pain? A prospective cohort study. Rheumatol. Int. 36, 679–684 (2016).",
    "10. Guedj, D. & Weinberger, A. Effect of weather conditions on rheumatic patients. Ann. Rheum. Dis. 49, 158–159 (1990)."
  ],
  "bibtex": "@article{hagglund1994a,\n  author = {Hagglund, K.J.},\n  title = {beliefs about weather, and disease severity among patients with fibromyalgia},\n  journal = {Arthritis Rheum},\n  volume = {7},\n  pages = {130–135},\n  date = {1994},\n  more-authors = {true},\n  language = {}\n}\n@article{timmermans2014a,\n  author = {Timmermans, E.J.},\n  title = {Self-perceived weather sensitivity and joint pain in older people with osteoarthritis in six European countries: Results from the European Project on OSteoArthritis (EPOSA). BMC Musculoskelet},\n  journal = {Disord},\n  url = {https://doi.org/10.1186/1471-2474-15-66},\n  date = {2014},\n  more-authors = {true},\n  language = {}\n}\n@article{smedslund2011a,\n  author = {Smedslund, G. and Hagen, K.B.},\n  title = {Does rain really cause pain? A systematic review of the associations between weather factors and severity of pain in people with rheumatoid arthritis},\n  journal = {Eur. J. Pain},\n  volume = {15},\n  pages = {5–10},\n  date = {2011},\n  language = {}\n}\n@article{aikman1997a,\n  author = {Aikman, H.},\n  title = {The association between arthritis and the weather},\n  journal = {Int. J. Biometeorol},\n  volume = {40},\n  pages = {192–199},\n  date = {1997},\n  language = {}\n}\n@article{brennan2012a,\n  author = {Brennan, S.A.},\n  title = {Influence of weather variables on pain severity in end-stage osteoarthritis},\n  journal = {Int. Orthop},\n  volume = {36},\n  pages = {643–646},\n  date = {2012},\n  more-authors = {true},\n  language = {}\n}\n@article{smedslund2009a,\n  author = {Smedslund, G.},\n  title = {Does the weather really matter? 
A cohort study of influences of weather and solar conditions on daily variations of joint pain in patients with rheumatoid arthritis},\n  journal = {Arthritis Rheum},\n  volume = {61},\n  pages = {1243–1247},\n  date = {2009},\n  more-authors = {true},\n  language = {}\n}\n@article{bossema2013a,\n  author = {Bossema, E.R.},\n  title = {Influence of weather on daily symptoms of pain and fatigue in female patients with fibromyalgia: a multilevel regression analysis},\n  journal = {Arthritis Care Res. (Hoboken)},\n  volume = {65},\n  pages = {1019–1025},\n  date = {2013},\n  more-authors = {true},\n  language = {}\n}\n@article{duong2016a,\n  author = {Duong, V.},\n  title = {Does weather affect daily pain intensity levels in patients with acute low back pain? A prospective cohort study},\n  journal = {Rheumatol. Int},\n  volume = {36},\n  pages = {679–684},\n  date = {2016},\n  more-authors = {true},\n  language = {}\n}\n@article{guedj1990a,\n  author = {Guedj, D. and Weinberger, A.},\n  title = {Effect of weather conditions on rheumatic patients},\n  journal = {Ann. Rheum. Dis},\n  volume = {49},\n  pages = {158–159},\n  date = {1990},\n  language = {}\n}\n",
  "ris": "TY  - JOUR\nAU  - Hagglund, K.J.\nTI  - beliefs about weather, and disease severity among patients with fibromyalgia\nT2  - Arthritis Rheum\nVL  - 7\nSP  - 130\nEP  - 135\nPY  - 1994\nDA  - 1994\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Timmermans, E.J.\nTI  - Self-perceived weather sensitivity and joint pain in older people with osteoarthritis in six European countries: Results from the European Project on OSteoArthritis (EPOSA). BMC Musculoskelet\nT2  - Disord\nUR  - https://doi.org/10.1186/1471-2474-15-66\nPY  - 2014\nDA  - 2014\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Smedslund, G.\nAU  - Hagen, K.B.\nTI  - Does rain really cause pain? A systematic review of the associations between weather factors and severity of pain in people with rheumatoid arthritis\nT2  - Eur. J. Pain\nVL  - 15\nSP  - 5\nEP  - 10\nPY  - 2011\nDA  - 2011\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Aikman, H.\nTI  - The association between arthritis and the weather\nT2  - Int. J. Biometeorol\nVL  - 40\nSP  - 192\nEP  - 199\nPY  - 1997\nDA  - 1997\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Brennan, S.A.\nTI  - Influence of weather variables on pain severity in end-stage osteoarthritis\nT2  - Int. Orthop\nVL  - 36\nSP  - 643\nEP  - 646\nPY  - 2012\nDA  - 2012\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Smedslund, G.\nTI  - Does the weather really matter? A cohort study of influences of weather and solar conditions on daily variations of joint pain in patients with rheumatoid arthritis\nT2  - Arthritis Rheum\nVL  - 61\nSP  - 1243\nEP  - 1247\nPY  - 2009\nDA  - 2009\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Bossema, E.R.\nTI  - Influence of weather on daily symptoms of pain and fatigue in female patients with fibromyalgia: a multilevel regression analysis\nT2  - Arthritis Care Res. 
(Hoboken)\nVL  - 65\nSP  - 1019\nEP  - 1025\nPY  - 2013\nDA  - 2013\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Duong, V.\nTI  - Does weather affect daily pain intensity levels in patients with acute low back pain? A prospective cohort study\nT2  - Rheumatol. Int\nVL  - 36\nSP  - 679\nEP  - 684\nPY  - 2016\nDA  - 2016\nC1  - true\nLA  - \nER  - \n\nTY  - JOUR\nAU  - Guedj, D.\nAU  - Weinberger, A.\nTI  - Effect of weather conditions on rheumatic patients\nT2  - Ann. Rheum. Dis\nVL  - 49\nSP  - 158\nEP  - 159\nPY  - 1990\nDA  - 1990\nLA  - \nER  - \n\n",
  "reference_links": [
    {
      "id": "2",
      "entry": "2. Hagglund, K. J. et al. Weather, beliefs about weather, and disease severity among patients with fibromyalgia. Arthritis Rheum. 7, 130–135 (1994).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Hagglund%2C%20K.J.%20beliefs%20about%20weather%2C%20and%20disease%20severity%20among%20patients%20with%20fibromyalgia%201994",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Hagglund%2C%20K.J.%20beliefs%20about%20weather%2C%20and%20disease%20severity%20among%20patients%20with%20fibromyalgia%201994"
    },
    {
      "id": "3",
      "entry": "3. Timmermans, E. J. et al. Self-perceived weather sensitivity and joint pain in older people with osteoarthritis in six European countries: Results from the European Project on OSteoArthritis (EPOSA). BMC Musculoskelet. Disord., https://doi.org/10.1186/1471-2474-15-66 (2014).",
      "crossref": "https://dx.doi.org/10.1186/1471-2474-15-66",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=https%3A//dx.doi.org/10.1186/1471-2474-15-66"
    },
    {
      "id": "4",
      "entry": "4. Smedslund, G. & Hagen, K. B. Does rain really cause pain? A systematic review of the associations between weather factors and severity of pain in people with rheumatoid arthritis. Eur. J. Pain. 15, 5–10 (2011).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Smedslund%2C%20G.%20Hagen%2C%20K.B.%20Does%20rain%20really%20cause%20pain%3F%20A%20systematic%20review%20of%20the%20associations%20between%20weather%20factors%20and%20severity%20of%20pain%20in%20people%20with%20rheumatoid%20arthritis%202011",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Smedslund%2C%20G.%20Hagen%2C%20K.B.%20Does%20rain%20really%20cause%20pain%3F%20A%20systematic%20review%20of%20the%20associations%20between%20weather%20factors%20and%20severity%20of%20pain%20in%20people%20with%20rheumatoid%20arthritis%202011"
    },
    {
      "id": "5",
      "entry": "5. Aikman, H. The association between arthritis and the weather. Int. J. Biometeorol. 40, 192–199 (1997).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Aikman%2C%20H.%20The%20association%20between%20arthritis%20and%20the%20weather%201997",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Aikman%2C%20H.%20The%20association%20between%20arthritis%20and%20the%20weather%201997"
    },
    {
      "id": "6",
      "entry": "6. Brennan, S. A. et al. Influence of weather variables on pain severity in end-stage osteoarthritis. Int. Orthop. 36, 643–646 (2012).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Brennan%2C%20S.A.%20Influence%20of%20weather%20variables%20on%20pain%20severity%20in%20end-stage%20osteoarthritis%202012",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Brennan%2C%20S.A.%20Influence%20of%20weather%20variables%20on%20pain%20severity%20in%20end-stage%20osteoarthritis%202012"
    },
    {
      "id": "7",
      "entry": "7. Smedslund, G. et al. Does the weather really matter? A cohort study of influences of weather and solar conditions on daily variations of joint pain in patients with rheumatoid arthritis. Arthritis Rheum. 61, 1243–1247 (2009).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Smedslund%2C%20G.%20Does%20the%20weather%20really%20matter%3F%20A%20cohort%20study%20of%20influences%20of%20weather%20and%20solar%20conditions%20on%20daily%20variations%20of%20joint%20pain%20in%20patients%20with%20rheumatoid%20arthritis%202009",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Smedslund%2C%20G.%20Does%20the%20weather%20really%20matter%3F%20A%20cohort%20study%20of%20influences%20of%20weather%20and%20solar%20conditions%20on%20daily%20variations%20of%20joint%20pain%20in%20patients%20with%20rheumatoid%20arthritis%202009"
    },
    {
      "id": "8",
      "entry": "8. Bossema, E. R. et al. Influence of weather on daily symptoms of pain and fatigue in female patients with fibromyalgia: a multilevel regression analysis. Arthritis Care Res. (Hoboken) 65, 1019–1025 (2013).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Bossema%2C%20E.R.%20Influence%20of%20weather%20on%20daily%20symptoms%20of%20pain%20and%20fatigue%20in%20female%20patients%20with%20fibromyalgia%3A%20a%20multilevel%20regression%20analysis%202013",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Bossema%2C%20E.R.%20Influence%20of%20weather%20on%20daily%20symptoms%20of%20pain%20and%20fatigue%20in%20female%20patients%20with%20fibromyalgia%3A%20a%20multilevel%20regression%20analysis%202013"
    },
    {
      "id": "9",
      "entry": "9. Duong, V. et al. Does weather affect daily pain intensity levels in patients with acute low back pain? A prospective cohort study. Rheumatol. Int. 36, 679–684 (2016).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Duong%2C%20V.%20Does%20weather%20affect%20daily%20pain%20intensity%20levels%20in%20patients%20with%20acute%20low%20back%20pain%3F%20A%20prospective%20cohort%20study%202016",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Duong%2C%20V.%20Does%20weather%20affect%20daily%20pain%20intensity%20levels%20in%20patients%20with%20acute%20low%20back%20pain%3F%20A%20prospective%20cohort%20study%202016"
    },
    {
      "id": "10",
      "entry": "10. Guedj, D. & Weinberger, A. Effect of weather conditions on rheumatic patients. Ann. Rheum. Dis. 49, 158–159 (1990).",
      "scholar_url": "https://scholar.google.co.uk/scholar?q=Guedj%2C%20D.%20Weinberger%2C%20A.%20Effect%20of%20weather%20conditions%20on%20rheumatic%20patients%201990",
      "oa_query": "https://ref.scholarcy.com/oa_version?query=Guedj%2C%20D.%20Weinberger%2C%20A.%20Effect%20of%20weather%20conditions%20on%20rheumatic%20patients%201990"
    }
  ]
}

This endpoint extracts references from a remote URL, returning JSON data with embedded RIS and BibTeX content. The remote URL can resolve to a document type in any of the formats listed for the POST endpoint.
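The ris value in the response is a single string in which each record runs from a TY line to an ER line. A minimal sketch of splitting it into records with the standard library follows. The function name parse_ris is hypothetical, and the parser only handles the one-tag-per-line layout seen in the sample output; continuation lines, if any occur, are ignored.

```python
import re

# An RIS line is a two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  -\s?(.*)$")


def parse_ris(ris_text):
    """Split an RIS string into a list of records (tag -> list of values)."""
    records, current = [], {}
    for line in ris_text.splitlines():
        match = RIS_LINE.match(line)
        if not match:
            continue  # blank separator lines between records
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # end-of-record marker
            records.append(current)
            current = {}
        elif value:  # skip empty fields such as "LA  - "
            current.setdefault(tag, []).append(value)
    return records


# Shortened copy of the "ris" value from the sample response above.
ris = (
    "TY  - JOUR\nAU  - Smedslund, G.\nAU  - Hagen, K.B.\n"
    "T2  - Eur. J. Pain\nVL  - 15\nSP  - 5\nEP  - 10\nPY  - 2011\nLA  - \nER  - \n\n"
    "TY  - JOUR\nAU  - Aikman, H.\nT2  - Int. J. Biometeorol\nVL  - 40\n"
    "SP  - 192\nEP  - 199\nPY  - 1997\nLA  - \nER  - \n\n"
)
records = parse_ris(ris)
for record in records:
    print(record["AU"], record["PY"][0])
```

Values are kept as lists because tags such as AU can repeat within one record.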

HTTP Request

GET https://ref.scholarcy.com/api/references/extract

Query Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| url | null | URL of public, open-access document. |
| document_type | full_paper | full_paper: a complete research paper, chapter or thesis. bibliography: a file containing just bibliographic items. |
| references | null | Optional. If you don't want to upload a file, you can pass a text string containing line-delimited references that you wish to parse into the desired output format. |
| resolve_references | true | If true, parse each reference into BibTeX and RIS data and create link resolvers for each reference. |
| reference_style | ensemble | Referencing style used by the document. If unsure, use the default ensemble or choose experimental. Other options include: acs, ama, apa, chicago, harvard, ieee, mhra, mla, nature, vancouver. |
| engine | v1 | PDF processing engine. v1: uses the XPDF tool and works best for most PDFs. v2: uses the poppler tool and may work better for PDFs with marginal line numbering, with multiple columns, or for those PDFs where v1 fails to extract useful information. |
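Because the GET endpoint takes its inputs as query parameters, the document URL itself has to be percent-encoded inside the request URL. HTTP clients such as requests do this automatically. The sketch below shows the resulting URL using only the standard library. The helper name build_extract_url is hypothetical, and sending booleans as lowercase strings is an assumption based on the curl example.

```python
from urllib.parse import urlencode

API_DOMAIN = "https://ref.scholarcy.com"
GET_ENDPOINT = API_DOMAIN + "/api/references/extract"


def build_extract_url(document_url, resolve_references=True,
                      reference_style="ensemble", engine="v1"):
    """Return the GET URL with all parameters percent-encoded in the query string."""
    params = {
        "url": document_url,
        # Assumption: booleans are sent as lowercase strings, as in the curl example.
        "resolve_references": "true" if resolve_references else "false",
        "reference_style": reference_style,
        "engine": engine,
    }
    return GET_ENDPOINT + "?" + urlencode(params)


request_url = build_extract_url(
    "https://www.nature.com/articles/s41746-019-0180-3.pdf",
    engine="v2",  # try the poppler-based engine if v1 extracts nothing useful
)
print(request_url)
```

Switching engine to v2 on a retry is one way to apply the guidance in the table when v1 fails to extract useful information.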

Extract Key Terms

The API endpoints at https://api.scholarcy.com/api/keywords/extract will pull out the key terms from an article.

POST a local file to extract key terms

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/keywords/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb'),
            :start_page => 24,
            :end_page => 37
          })
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/api/keywords/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

params = {'start_page': 24, 'end_page': 37}
with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          data=params,
          timeout=timeout)
    print(r.json())

curl "https://api.scholarcy.com/api/keywords/extract" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf" \
  -F "start_page=24" \
  -F "end_page=37"

The above command returns JSON structured like this:

{
  "filename": "article2016.pdf",
  "abbreviations": {
    "U&G": "uses and gratification",
    "SNS": "social networking sites",
    "COBRA": "Consumer Online Brand-Related Activity",
    "AVE": "average variance extracted",
    "CLF": "common latent factor"
  },
  "keywords": [
    {
      "term": "Facebook",
      "url": "https://en.wikipedia.org/wiki/Facebook"
    },
    {
      "term": "typology",
      "url": "https://en.wikipedia.org/wiki/typology"
    },
    {
      "term": "social media",
      "url": "https://en.wikipedia.org/wiki/social_media"
    },
    {
      "term": "consumer behavior",
      "url": "https://en.wikipedia.org/wiki/consumer_behavior"
    },
    {
      "term": "average variance extracted",
      "url": "https://en.wikipedia.org/wiki/average_variance_extracted"
    },
    {
      "term": "online branding",
      "url": "https://en.wikipedia.org/wiki/online_branding"
    },
    {
      "term": "cluster analysis",
      "url": "https://en.wikipedia.org/wiki/cluster_analysis"
    },
    {
      "term": "social networking sites",
      "url": "https://en.wikipedia.org/wiki/social_networking_sites"
    },
    {
      "term": "brand manager",
      "url": "https://en.wikipedia.org/wiki/brand_manager"
    }
  ],
  "keyword_relevance": {
    "Facebook": 0.3834808259587021,
    "social media": 0.16224188790560473,
    "social networking sites": 0.08259587020648967,
    "brand interaction": 0.07964601769911504,
    "typology": 0.038348082595870206,
    "brand manager": 0.038348082595870206,
    "uses and gratification": 0.035398230088495575,
    "consumer interaction": 0.032448377581120944,
    "consumer behavior": 0.02359882005899705,
    "brand communication": 0.02064896755162242,
    "cluster analysis": 0.017699115044247787,
    "main motivation": 0.017699115044247787,
    "Consumer Online Brand-Related Activity": 0.014749262536873156,
    "average variance extracted": 0.011799410029498525,
    "common latent factor": 0.008849557522123894,
    "online branding": 0.008849557522123894
  }
}

When CSV output is requested via output_format, the above command returns CSV structured like this:

"filename","key term","wikipedia_link"
"article2016.pdf","Facebook","https://en.wikipedia.org/wiki/Facebook"
"article2016.pdf","typology","https://en.wikipedia.org/wiki/typology"
"article2016.pdf","social media","https://en.wikipedia.org/wiki/social_media"
"article2016.pdf","consumer behavior","https://en.wikipedia.org/wiki/consumer_behavior"
"article2016.pdf","average variance extracted","https://en.wikipedia.org/wiki/average_variance_extracted"
"article2016.pdf","online branding","https://en.wikipedia.org/wiki/online_branding"
"article2016.pdf","cluster analysis","https://en.wikipedia.org/wiki/cluster_analysis"
"article2016.pdf","social networking sites","https://en.wikipedia.org/wiki/social_networking_sites"
"article2016.pdf","brand manager","https://en.wikipedia.org/wiki/brand_manager"

This endpoint extracts key terms from a local file. File formats supported are:

HTTP Request

POST https://api.scholarcy.com/api/keywords/extract

Query Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| file | null | A file object. |
| url | null | URL of public, open-access document. |
| text | null | Plain text content to be processed. |
| start_page | 1 | Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file. |
| end_page | null | Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file. |
| external_metadata | false | If true, fetch article metadata from the relevant remote repository (e.g. CrossRef). |
| wiki_links | false | If true, map extracted key terms to their Wikipedia pages. |
| sampling | representative | For large documents, when extracting key terms, use either a representative sample of the full content, or the fulltext content. |
| extract_snippets | true | If true, sample snippets from each section; otherwise, sample the full text. |
| output_format | json | json or CSV. |

GET key terms from a URL

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
GET_ENDPOINT = API_DOMAIN + '/api/keywords/extract'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
# rest-client takes query parameters from the :params key of the headers hash
params = {
  :url => 'https://www.nature.com/articles/s41746-019-0180-3',
  :start_page => 1
}

request = RestClient::Request.new(
          :method => :get,
          :url => GET_ENDPOINT,
          :headers => headers.merge(:params => params))
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://api.scholarcy.com'
GET_ENDPOINT = API_DOMAIN + '/api/keywords/extract'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

# Sent as query parameters on the GET request
payload = {
    'url': 'https://www.nature.com/articles/s41746-019-0180-3',
    'start_page': 1
}
r = requests.get(GET_ENDPOINT,
      headers=headers,
      params=payload,
      timeout=timeout)
print(r.json())

curl -G "https://api.scholarcy.com/api/keywords/extract" \
  -H "Authorization: Bearer abcdefg" \
  --data-urlencode "url=https://www.nature.com/articles/s41746-019-0180-3" \
  -d "start_page=1"

The above command returns JSON structured as for the POST endpoint:

{
  "filename": "article2016.pdf",
  "abbreviations": {
    "U&G": "uses and gratification",
    "SNS": "social networking sites",
    "COBRA": "Consumer Online Brand-Related Activity",
    "AVE": "average variance extracted",
    "CLF": "common latent factor"
  },
  "keywords": [
    {
      "term": "Facebook",
      "url": "https://en.wikipedia.org/wiki/Facebook"
    },
    {
      "term": "typology",
      "url": "https://en.wikipedia.org/wiki/typology"
    },
    {
      "term": "social media",
      "url": "https://en.wikipedia.org/wiki/social_media"
    },
    {
      "term": "consumer behavior",
      "url": "https://en.wikipedia.org/wiki/consumer_behavior"
    },
    {
      "term": "average variance extracted",
      "url": "https://en.wikipedia.org/wiki/average_variance_extracted"
    },
    {
      "term": "online branding",
      "url": "https://en.wikipedia.org/wiki/online_branding"
    },
    {
      "term": "cluster analysis",
      "url": "https://en.wikipedia.org/wiki/cluster_analysis"
    },
    {
      "term": "social networking sites",
      "url": "https://en.wikipedia.org/wiki/social_networking_sites"
    },
    {
      "term": "brand manager",
      "url": "https://en.wikipedia.org/wiki/brand_manager"
    }
  ],
  "keyword_relevance": {
    "Facebook": 0.3834808259587021,
    "social media": 0.16224188790560473,
    "social networking sites": 0.08259587020648967,
    "brand interaction": 0.07964601769911504,
    "typology": 0.038348082595870206,
    "brand manager": 0.038348082595870206,
    "uses and gratification": 0.035398230088495575,
    "consumer interaction": 0.032448377581120944,
    "consumer behavior": 0.02359882005899705,
    "brand communication": 0.02064896755162242,
    "cluster analysis": 0.017699115044247787,
    "main motivation": 0.017699115044247787,
    "Consumer Online Brand-Related Activity": 0.014749262536873156,
    "average variance extracted": 0.011799410029498525,
    "common latent factor": 0.008849557522123894,
    "online branding": 0.008849557522123894
  }
}

When CSV output is requested via output_format, the above command returns CSV structured like this:

"filename","key term","wikipedia_link"
"article2016.pdf","Facebook","https://en.wikipedia.org/wiki/Facebook"
"article2016.pdf","typology","https://en.wikipedia.org/wiki/typology"
"article2016.pdf","social media","https://en.wikipedia.org/wiki/social_media"
"article2016.pdf","consumer behavior","https://en.wikipedia.org/wiki/consumer_behavior"
"article2016.pdf","average variance extracted","https://en.wikipedia.org/wiki/average_variance_extracted"
"article2016.pdf","online branding","https://en.wikipedia.org/wiki/online_branding"
"article2016.pdf","cluster analysis","https://en.wikipedia.org/wiki/cluster_analysis"
"article2016.pdf","social networking sites","https://en.wikipedia.org/wiki/social_networking_sites"
"article2016.pdf","brand manager","https://en.wikipedia.org/wiki/brand_manager"

This endpoint extracts key terms from a remote URL. The remote URL can resolve to a document in any of the formats listed for the POST endpoint.
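
If you request CSV output, the rows shown above can be read with Python's standard csv module. This is a minimal sketch that parses a shortened copy of the example CSV; the helper name parse_keyword_csv is hypothetical, and the column names are taken from the header row of the example.

```python
import csv
import io

# Shortened copy of the example CSV output above.
csv_text = '''"filename","key term","wikipedia_link"
"article2016.pdf","Facebook","https://en.wikipedia.org/wiki/Facebook"
"article2016.pdf","typology","https://en.wikipedia.org/wiki/typology"
'''


def parse_keyword_csv(text):
    """Parse the CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = parse_keyword_csv(csv_text)
terms = [row["key term"] for row in rows]
print(terms)
```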

HTTP Request

GET https://api.scholarcy.com/api/keywords/extract

Query Parameters

Parameter Default Description
url null URL of public, open-access document.
text null Plain text content to be processed.
start_page 1 Start reading the document from this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
end_page null Stop reading the document at this page (PDF URLs only). Useful for processing a single article/chapter within a larger file.
external_metadata false If true, fetch article metadata from the relevant remote repository (e.g. CrossRef).
wiki_links false If true, map extracted key terms to their Wikipedia pages.
sampling representative For large documents, when extracting key terms, use either a representative sample of the full content, or the fulltext content.
extract_snippets true If true, sample snippets from each section, otherwise, sample the full text.
output_format json json or CSV.
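
The query parameters above can be assembled into a request URL before calling the endpoint. The following sketch only builds the URL and does not contact the server. The helper name build_keywords_url and the example.org document URL are placeholders, and sending booleans as lowercase true/false is an assumption based on the shell examples in this documentation.

```python
from urllib.parse import urlencode

KEYWORDS_ENDPOINT = "https://api.scholarcy.com/api/keywords/extract"


def build_keywords_url(document_url, start_page=1, end_page=None,
                       wiki_links=False, output_format="json"):
    """Build a GET URL for key term extraction from a remote document."""
    if end_page is not None and end_page < start_page:
        raise ValueError("end_page must not be before start_page")
    params = {
        "url": document_url,
        "start_page": start_page,
        # Assumption: booleans are sent as lowercase strings, as in the curl examples.
        "wiki_links": str(wiki_links).lower(),
        "output_format": output_format,
    }
    # Leave end_page out entirely so the server default (null) applies.
    if end_page is not None:
        params["end_page"] = end_page
    return KEYWORDS_ENDPOINT + "?" + urlencode(params)


# Placeholder document URL: process pages 12 to 30 of a larger PDF.
request_url = build_keywords_url("https://example.org/book.pdf",
                                 start_page=12, end_page=30, wiki_links=True)
print(request_url)
```

The resulting URL can then be passed to requests.get together with the Authorization header used in the other examples.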

Generate a Synopsis

The API endpoints at https://summarizer.scholarcy.com/summarize will generate a short, abstractive synopsis (70-100 words) or a mini-review (around 150-300 words), depending on the parameters chosen.

By default, output is in JSON format.

Alternatively, you can receive output in HTML format if you pass an Accept: text/html header with your request.
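
As a small sketch of the header choice described above, the snippet below builds the request headers with or without Accept: text/html. It uses only the standard library and constructs a request object without sending it; the helper name build_headers and the API key are placeholders.

```python
import urllib.request

SUMMARIZE_ENDPOINT = "https://summarizer.scholarcy.com/summarize"


def build_headers(api_key, want_html=False):
    """Return request headers; JSON is the default output format."""
    headers = {"Authorization": "Bearer " + api_key}
    if want_html:
        # Ask for HTML output instead of the default JSON.
        headers["Accept"] = "text/html"
    return headers


# Constructing the Request object does not contact the server.
html_request = urllib.request.Request(
    SUMMARIZE_ENDPOINT, headers=build_headers("abcdefg", want_html=True))
print(html_request.header_items())
```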

POST a local file to generate a synopsis

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://summarizer.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/summarize'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}
file_path = '/path/to/local/file.pdf'

request = RestClient::Request.new(
          :method => :post,
          :url => POST_ENDPOINT,
          :headers => headers,
          :payload => {
            :multipart => true,
            :file => File.new(file_path, 'rb'),
            :wiki_links => true,
            :format_summary => true
          })
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://summarizer.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/summarize'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

file_path = '/path/to/local/file.pdf'

params = {'wiki_links': True, 'format_summary': True}
with open(file_path, 'rb') as file_data:
    file_payload = {'file': file_data}
    r = requests.post(POST_ENDPOINT,
          headers=headers,
          files=file_payload,
          data=params,
          timeout=timeout)
    print(r.json())

curl "https://summarizer.scholarcy.com/summarize" \
  -H "Authorization: Bearer abcdefg" \
  -F "file=@/path/to/local/file.pdf" \
  -F "wiki_links=true" \
  -F "format_summary=true"

The above command returns JSON structured like this:

{
    "response": {
        "abbreviations": {
            "EPOSA": "European Project on OSteoArthritis",
            "GPS": "Global Positioning System",
            "ISD": "Integrated Surface Database",
            "OR": "odds ratio"
        },
        "headline": "Researchers have used smartphone data to investigate the relationship between pain and weather conditions, and found that there is a small but significant relationship.",
        "keywords": [
            {
                "term": "physical activity",
                "url": "https://en.wikipedia.org/wiki/physical_activity"
            },
            {
                "term": "osteoarthritis",
                "url": "https://en.wikipedia.org/wiki/osteoarthritis"
            },
            {
                "term": "atmospheric pressure",
                "url": "https://en.wikipedia.org/wiki/atmospheric_pressure"
            },
            {
                "term": "rheumatoid arthritis",
                "url": "https://en.wikipedia.org/wiki/rheumatoid_arthritis"
            },
            {
                "term": "Global Positioning System",
                "url": "https://en.wikipedia.org/wiki/Global_Positioning_System"
            },
            {
                "term": "fibromyalgia",
                "url": "https://en.wikipedia.org/wiki/fibromyalgia"
            },
            {
                "term": "arthritis",
                "url": "https://en.wikipedia.org/wiki/arthritis"
            },
            {
                "term": "smartphone app",
                "url": "https://en.wikipedia.org/wiki/smartphone_app"
            },
            {
                "term": "relative humidity",
                "url": "https://en.wikipedia.org/wiki/relative_humidity"
            },
            {
                "term": "chronic pain",
                "url": "https://en.wikipedia.org/wiki/chronic_pain"
            },
            {
                "term": "odds ratio",
                "url": "https://en.wikipedia.org/wiki/odds_ratio"
            },
            {
                "term": "Parkinson disease",
                "url": "https://en.wikipedia.org/wiki/Parkinson_disease"
            },
            {
                "term": "joint pain",
                "url": "https://en.wikipedia.org/wiki/joint_pain"
            },
            {
                "term": "wind speed",
                "url": "https://en.wikipedia.org/wiki/wind_speed"
            },
            {
                "term": "cohort study",
                "url": "https://en.wikipedia.org/wiki/cohort_study"
            }
        ],
        "message": "",
        "metadata": {
            "citation": "William G. Dixon, Anna L. Beukenhorst, Belay B. Yimer, Louise Cook, Antonio Gasparrini, Tal El-Hay, Bruce Hellman, Ben James, Ana M. Vicedo-Cabrera, Malcolm Maclure, Ricardo Silva, John Ainsworth, Huai Leng Pisaniello, Thomas House, Mark Lunt, Carolyn Gamble, Caroline Sanders, David M. Schultz, Jamie C. Sergeant, John McBeth (2019). How the weather affects the pain of citizen scientists using a smartphone app. npj Digital Medicine 2. https://www.nature.com/articles/s41746-019-0180-3",
            "citation_affiliation": "",
            "citation_author": "William G. Dixon et al.",
            "citation_date": 2019,
            "citation_title": "How the weather affects the pain of citizen scientists using a smartphone app",
            "citation_url": "https://www.nature.com/articles/s41746-019-0180-3"
        },
        "readership_level": "technical-readership-accurate",
        "summary": "<a class=\"has-tooltip\" title=\"Read the article\" target=\"_blank\" href=\"https://www.nature.com/articles/s41746-019-0180-3\">William Dixon et al. (2019)</a> studied how the weather affects the pain of citizen scientists using a smartphone app. Weather has been thought to affect symptoms in patients with chronic disease since the time of Hippocrates over 2000 years ago.\nMultivariable case-crossover analysis including the four state weather variables demonstrated that an increase in relative humidity was associated with a higher odds of a pain event with an OR of 1.139 (95% confidence interval 1.099\u20131.181) per 10 percentage point increase.\nThis study has demonstrated that higher relative humidity and wind speed, and lower atmospheric pressure, were associated with increased pain severity in people with long-term pain conditions.\nThe \u2018worst\u2019 combination of weather variables would increase the odds of a pain event by just over 20% compared to an average day.<br/><br/>There were 2658 patients involved in the research. Discussing potential improvements, \u201cThere are potential limitations to this study.\nIt is possible only people with a strong belief in a weather\u2013pain relationship participated.\nRain and cold weather were the most common pre-existing beliefs, authors say,\u201d they admit. ",
        "title": "How the weather affects the pain of citizen scientists using a smartphone app"
    }
}
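
When format_summary is true, the summary field contains HTML markup (a citation link and line breaks), as in the example above. If you need plain text, one possible approach is to strip the tags with the standard library. This is a sketch, not part of the API: the class and function names are hypothetical, and the input is an abridged copy of the example summary.

```python
from html.parser import HTMLParser


class SummaryTextExtractor(HTMLParser):
    """Collect text content, turning <br/> tags into newlines."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <br/>.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def summary_to_text(summary_html):
    parser = SummaryTextExtractor()
    parser.feed(summary_html)
    parser.close()
    return "".join(parser.parts).strip()


# Abridged copy of the "summary" value from the example response.
summary = (
    '<a class="has-tooltip" title="Read the article" target="_blank" '
    'href="https://www.nature.com/articles/s41746-019-0180-3">'
    'William Dixon et al. (2019)</a> studied how the weather affects the pain '
    'of citizen scientists using a smartphone app.'
    '<br/><br/>There were 2658 patients involved in the research.'
)
text = summary_to_text(summary)
print(text)
```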

This endpoint generates a synopsis from a local file. File formats supported are:

HTTP Request

POST https://summarizer.scholarcy.com/summarize

Query Parameters

Parameter Default Description
file null A file object.
url null URL of public, open-access document. Can be a DOI but must be qualified with a resolver domain, e.g. https://doi.org/10.1177/0846537120913497
input_text null You can pass a text string directly to the endpoint, instead of uploading a file or passing a URL.
structured_summary false Take the document structure into account, considering specific sections such as Introduction, Background, Methods, Results, Discussion, Conclusion.
summary_type combined Level of detail of summary: overview: abstractive synopsis of Scholarcy highlights. detail: abstractive synopsis of Scholarcy summary. combined: union of overview and detail. merged: an abstractive synopsis of the union of Scholarcy highlights and Scholarcy summary.
focus_level 4 This internal hyperparameter controls whether the summary takes a narrow focus on a specific fact or a wider focus on multiple facts within the source. 4: wide focus. 3: medium focus. 2: narrow focus. 1: narrowest focus.
readership_level technical-readership-accurate This controls the level of language complexity and amount of paraphrasing in the output. technical-readership-accurate: output is for a technical/academic reader with a high level of factual accuracy in relation to the source text. technical-readership-fast: output is for a technical/academic reader and provides a little more paraphrasing, which may result in a slight loss in accuracy. However, it is 2x faster than technical-readership-accurate. lay-readership-accurate: output is for a lay/non-expert reader, with moderate paraphrasing and a good level of accuracy in relation to the source text. lay-readership-fast: output is for a lay/non-expert reader, with much paraphrasing and a reasonable level of accuracy in relation to the source text. However, it is 2x faster than lay-readership-accurate.
wiki_links false Map extracted key terms to Wikipedia entries.
format_summary false Format the summary so it can be more easily used as part of a referenced report: 1) Personal pronouns referring to the authors are replaced with the author names. 2) The summary is correctly cited with author and date. 3) A formatted reference to the source is generated.
headline_type verbatim Determines how the headline is generated. verbatim (default): uses the main finding extracted directly from the paper. The other options are as for readership_level, i.e. technical-readership-accurate, technical-readership-fast, lay-readership-accurate and lay-readership-fast. If format_summary is true, then headline_type defaults to lay-readership-accurate unless otherwise specified.
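
Because several of these parameters accept only a fixed set of values, it can help to check them on the client before sending a request. The sketch below encodes the allowed values listed in the table; the helper name build_summary_params is hypothetical, and sending booleans as lowercase strings is an assumption based on the shell example. headline_type is left out unless set, so that the server-side default described above applies.

```python
SUMMARY_TYPES = {"overview", "detail", "combined", "merged"}
READERSHIP_LEVELS = {
    "technical-readership-accurate",
    "technical-readership-fast",
    "lay-readership-accurate",
    "lay-readership-fast",
}
HEADLINE_TYPES = READERSHIP_LEVELS | {"verbatim"}


def build_summary_params(summary_type="combined", focus_level=4,
                         readership_level="technical-readership-accurate",
                         format_summary=False, headline_type=None):
    """Validate synopsis options and return them as a parameter dict."""
    if summary_type not in SUMMARY_TYPES:
        raise ValueError(f"unknown summary_type: {summary_type!r}")
    if focus_level not in (1, 2, 3, 4):
        raise ValueError("focus_level must be 1 (narrowest) to 4 (wide)")
    if readership_level not in READERSHIP_LEVELS:
        raise ValueError(f"unknown readership_level: {readership_level!r}")
    params = {
        "summary_type": summary_type,
        "focus_level": focus_level,
        "readership_level": readership_level,
        # Assumption: booleans are sent as lowercase strings.
        "format_summary": str(format_summary).lower(),
    }
    if headline_type is not None:
        if headline_type not in HEADLINE_TYPES:
            raise ValueError(f"unknown headline_type: {headline_type!r}")
        params["headline_type"] = headline_type
    return params


params = build_summary_params(summary_type="overview", focus_level=2,
                              format_summary=True)
print(params)
```

The returned dict can be passed as the data argument of requests.post alongside the file payload.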

GET a synopsis from a URL

require 'rest-client'
AUTH_TOKEN = 'abcdef' # Your API key
API_DOMAIN = 'https://summarizer.scholarcy.com'
GET_ENDPOINT = API_DOMAIN + '/summarize'
headers = {"Authorization": "Bearer " + AUTH_TOKEN}

# rest-client sends GET query parameters via the :params key of :headers
request = RestClient::Request.new(
          :method => :get,
          :url => GET_ENDPOINT,
          :headers => headers.merge(
            :params => {
              :url => 'https://www.nature.com/articles/s41746-019-0180-3',
              :wiki_links => true,
              :format_summary => true
            })
          )
response = request.execute
puts(response.body)

import requests
timeout = 30

AUTH_TOKEN = 'abcdefg' # Your API key
API_DOMAIN = 'https://summarizer.scholarcy.com'
POST_ENDPOINT = API_DOMAIN + '/summarize'
headers = {'Authorization': 'Bearer ' + AUTH_TOKEN}

payload = {
    'url': 'https://www.nature.com/articles/s41746-019-0180-3',
    'wiki_links': True,
    'format_summary': True
}
r = requests.get(POST_ENDPOINT,
      headers=headers,
      params=payload,
      timeout=timeout)
print(r.json())

# -G sends the data values as query parameters in a GET request
curl -G "https://summarizer.scholarcy.com/summarize" \
  -H "Authorization: Bearer abcdefg" \
  --data-urlencode "url=https://www.nature.com/articles/s41746-019-0180-3" \
  -d "wiki_links=true" \
  -d "format_summary=true"

The above command returns JSON structured as for the POST endpoint:

{
    "response": {
        "abbreviations": {
            "EPOSA": "European Project on OSteoArthritis",
            "GPS": "Global Positioning System",
            "ISD": "Integrated Surface Database",
            "OR": "odds ratio"
        },
        "headline": "Researchers have used smartphone data to investigate the relationship between pain and weather conditions, and found that there is a small but significant relationship.",
        "keywords": [
            {
                "term": "physical activity",
                "url": "https://en.wikipedia.org/wiki/physical_activity"
            },
            {
                "term": "osteoarthritis",
                "url": "https://en.wikipedia.org/wiki/osteoarthritis"
            },
            {
                "term": "atmospheric pressure",
                "url": "https://en.wikipedia.org/wiki/atmospheric_pressure"
            },
            {
                "term": "rheumatoid arthritis",
                "url": "https://en.wikipedia.org/wiki/rheumatoid_arthritis"
            },
            {
                "term": "Global Positioning System",
                "url": "https://en.wikipedia.org/wiki/Global_Positioning_System"
            },
            {
                "term": "fibromyalgia",
                "url": "https://en.wikipedia.org/wiki/fibromyalgia"
            },
            {
                "term": "arthritis",
                "url": "https://en.wikipedia.org/wiki/arthritis"
            },
            {
                "term": "smartphone app",
                "url": "https://en.wikipedia.org/wiki/smartphone_app"
            },
            {
                "term": "relative humidity",
                "url": "https://en.wikipedia.org/wiki/relative_humidity"
            },
            {
                "term": "chronic pain",
                "url": "https://en.wikipedia.org/wiki/chronic_pain"
            },
            {
                "term": "odds ratio",
                "url": "https://en.wikipedia.org/wiki/odds_ratio"
            },
            {
                "term": "Parkinson disease",
                "url": "https://en.wikipedia.org/wiki/Parkinson_disease"
            },
            {
                "term": "joint pain",
                "url": "https://en.wikipedia.org/wiki/joint_pain"
            },
            {
                "term": "wind speed",
                "url": "https://en.wikipedia.org/wiki/wind_speed"
            },
            {
                "term": "cohort study",
                "url": "https://en.wikipedia.org/wiki/cohort_study"
            }
        ],
        "message": "",
        "metadata": {
            "citation": "William G. Dixon, Anna L. Beukenhorst, Belay B. Yimer, Louise Cook, Antonio Gasparrini, Tal El-Hay, Bruce Hellman, Ben James, Ana M. Vicedo-Cabrera, Malcolm Maclure, Ricardo Silva, John Ainsworth, Huai Leng Pisaniello, Thomas House, Mark Lunt, Carolyn Gamble, Caroline Sanders, David M. Schultz, Jamie C. Sergeant, John McBeth (2019). How the weather affects the pain of citizen scientists using a smartphone app. npj Digital Medicine 2. https://www.nature.com/articles/s41746-019-0180-3",
            "citation_affiliation": "",
            "citation_author": "William G. Dixon et al.",
            "citation_date": 2019,
            "citation_title": "How the weather affects the pain of citizen scientists using a smartphone app",
            "citation_url": "https://www.nature.com/articles/s41746-019-0180-3"
        },
        "readership_level": "technical-readership-accurate",
        "summary": "<a class=\"has-tooltip\" title=\"Read the article\" target=\"_blank\" href=\"https://www.nature.com/articles/s41746-019-0180-3\">William Dixon et al. (2019)</a> studied how the weather affects the pain of citizen scientists using a smartphone app. Weather has been thought to affect symptoms in patients with chronic disease since the time of Hippocrates over 2000 years ago.\nMultivariable case-crossover analysis including the four state weather variables demonstrated that an increase in relative humidity was associated with a higher odds of a pain event with an OR of 1.139 (95% confidence interval 1.099\u20131.181) per 10 percentage point increase.\nThis study has demonstrated that higher relative humidity and wind speed, and lower atmospheric pressure, were associated with increased pain severity in people with long-term pain conditions.\nThe \u2018worst\u2019 combination of weather variables would increase the odds of a pain event by just over 20% compared to an average day.<br/><br/>There were 2658 patients involved in the research. Discussing potential improvements, \u201cThere are potential limitations to this study.\nIt is possible only people with a strong belief in a weather\u2013pain relationship participated.\nRain and cold weather were the most common pre-existing beliefs, authors say,\u201d they admit. ",
        "title": "How the weather affects the pain of citizen scientists using a smartphone app"
    }
}

This endpoint generates a synopsis from a remote URL. The remote URL can resolve to a document in any of the formats listed for the POST endpoint.

HTTP Request

GET https://summarizer.scholarcy.com/summarize

Query Parameters

Parameter Default Description
url null URL of public, open-access document. Can be a DOI but must be qualified with a resolver domain, e.g. https://doi.org/10.1177/0846537120913497
input_text null You can pass a text string directly to the endpoint, instead of uploading a file or passing a URL.
structured_summary false Take the document structure into account, considering specific sections such as Introduction, Background, Methods, Results, Discussion, Conclusion.
summary_type combined Level of detail of summary: overview: abstractive synopsis of Scholarcy highlights. detail: abstractive synopsis of Scholarcy summary. combined: union of overview and detail. merged: an abstractive synopsis of the union of Scholarcy highlights and Scholarcy summary.
focus_level 4 This internal hyperparameter controls whether the summary takes a narrow focus on a specific fact or a wider focus on multiple facts within the source. 4: wide focus. 3: medium focus. 2: narrow focus. 1: narrowest focus.
readership_level technical-readership-accurate This controls the level of language complexity and amount of paraphrasing in the output. technical-readership-accurate: output is for a technical/academic reader with a high level of factual accuracy in relation to the source text. technical-readership-fast: output is for a technical/academic reader and provides a little more paraphrasing, which may result in a slight loss in accuracy. However, it is 2x faster than technical-readership-accurate. lay-readership-accurate: output is for a lay/non-expert reader, with moderate paraphrasing and a good level of accuracy in relation to the source text. lay-readership-fast: output is for a lay/non-expert reader, with much paraphrasing and a reasonable level of accuracy in relation to the source text. However, it is 2x faster than lay-readership-accurate.
wiki_links false Map extracted key terms to Wikipedia entries.
format_summary false Format the summary into a 'mini review' so it can be more easily used as the basis of a referenced report: 1) Personal pronouns referring to the authors are replaced with the author names. 2) The summary is correctly cited with author and date. 3) A formatted reference to the source is generated.
headline_type verbatim Determines how the headline is generated. verbatim (default): uses the main finding extracted directly from the paper. The other options are as for readership_level, i.e. technical-readership-accurate, technical-readership-fast, lay-readership-accurate and lay-readership-fast. If format_summary is true, then headline_type defaults to lay-readership-accurate unless otherwise specified.
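
The url parameter accepts a DOI only when it is qualified with a resolver domain. If your input may contain bare DOIs, a small helper along these lines could normalise them before the request is made. This is a sketch: the function name is hypothetical and the DOI pattern is a common heuristic, not a rule defined by the API.

```python
import re

# Heuristic: a bare DOI starts with "10.", a registrant code, then "/suffix".
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def qualify_document_url(value, resolver="https://doi.org/"):
    """Prefix bare DOIs with a resolver domain; leave full URLs unchanged."""
    value = value.strip()
    if value.lower().startswith("doi:"):
        value = value[4:].strip()
    if DOI_PATTERN.match(value):
        return resolver + value
    return value


print(qualify_document_url("10.1177/0846537120913497"))
print(qualify_document_url("https://www.nature.com/articles/s41746-019-0180-3"))
```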

Errors

The Scholarcy API uses the following error codes:

Error Code Meaning
400 Bad Request -- Your request is invalid.
401 Unauthorized -- Your API key is wrong.
403 Forbidden -- The API endpoint requested is hidden for administrators only.
404 Not Found -- The API endpoint could not be found.
405 Method Not Allowed -- You tried to call the API with an invalid method.
406 Not Acceptable -- You requested a format that isn't JSON.
429 Too Many Requests -- You're making too many API requests.
500 Internal Server Error -- We had a problem with our server. Try again later.
503 Service Unavailable -- We're temporarily offline for maintenance. Please try again later.
504 Gateway Timeout -- Serving your request took longer than expected. Please try again.
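
Several of these codes (429, 500, 503 and 504) indicate a condition that may clear if you try again later. A minimal sketch of a retry loop with exponential backoff is shown below. The function name, the number of attempts and the delays are illustrative assumptions, not limits published for the API, and the demo uses stubbed responses instead of real requests.

```python
import time

# Status codes from the table above that suggest trying again later.
RETRYABLE_STATUS = {429, 500, 503, 504}


def call_with_retries(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is any zero-argument callable returning (status_code, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            return status, body
        # Wait 1s, 2s, 4s, ... before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))


# Demo with stubbed responses; delays are recorded instead of slept.
responses = iter([(503, ""), (429, ""), (200, '{"response": {}}')])
delays = []
status, body = call_with_retries(lambda: next(responses), sleep=delays.append)
print(status, delays)
```

In real use, send could wrap a requests call and return (r.status_code, r.text).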