Commit 4424944

Improve documentation
1 parent 2ac3224 commit 4424944

22 files changed

+219
-219
lines changed

README.rst

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
11
pybliometrics
22
=============
33

4-
Access Elsevier Scopus's API from Python on a large scale.
4+
Enables large-scale access to Elsevier's Scopus API from Python.
55

66
Documentation: https://pybliometrics.readthedocs.io
77

docs/access/errors.rst

Lines changed: 3 additions & 3 deletions
@@ -23,11 +23,11 @@ Error message hierarchy
2323

2424
* `pybliometrics.scopus.exception.Scopus429Error: QUOTA EXCEEDED`: Your provided API key's weekly quota has been depleted. If you provided multiple keys in your :doc:`configuration file <../configuration>`, this means all your keys are depleted. In this case, wait up to a week until your API key's quota has been reset.
2525

26-
* `pybliometrics.scopus.exception.ScopusServerError`: General exception related to all Server-related exceptions defined below. This may happen for various reasons (the internet is a noisy medium); usually it helps to wait few seconds before the next query. Server errors are also raised if in searches you use a fieldname that does not exist. Verify that your query works in Scopus' `Advanced Search <https://www.scopus.com/search/form.uri?display=advanced>`_. Previously `pybliometrics` used more finegrained exceptions in the 5xx space, namely "Scopus500Error", "Scopus502Error" and "Scopus504Error". These are deprecated, use "ScopusServerError" instead.
26+
* `pybliometrics.scopus.exception.ScopusServerError`: General exception related to all Server-related exceptions defined below. This may happen for various reasons (the internet is a noisy medium); usually it helps to wait a few seconds before the next query. Server errors are also raised if you use a non-existent field name in searches. Verify that your query works in Scopus' `Advanced Search <https://www.scopus.com/search/form.uri?display=advanced>`_. Previously `pybliometrics` used more fine-grained exceptions in the 5xx space, namely "Scopus500Error", "Scopus502Error" and "Scopus504Error". These are deprecated; use "ScopusServerError" instead.
2727

2828
If queries break for other reasons, exceptions of type `requests.exceptions <https://requests.readthedocs.io/en/latest/api/?highlight=exceptions#exceptions>`_ are raised, such as:
2929

3030
`requests.exceptions.TooManyRedirects: Exceeded 30 redirects.`
31-
The entity you are looking for was not properly merged with another one entity in the sense that it is not forwarding. Happens rarely when Scopus Author profiles are merged. May also occur less often with Abstract EIDs and Affiliation IDs.
31+
The entity you are looking for was not properly merged with another entity, in the sense that it is not correctly forwarding. Happens rarely when Scopus Author profiles are merged. May also occur less often with Abstract EIDs and Affiliation IDs.
3232

33-
`pybliometrics` will retry to establish the connection a few times on typical server-side errors. How often is specified in your :doc:`configuration file <../configuration>`, section "Requests" value "Retries" (if none is given, `pybliometrics` makes 5 attempts).
33+
`pybliometrics` will automatically try to establish the connection a few times on typical server-side errors. The number of retries is specified in your :doc:`configuration file <../configuration>`, section "Requests" value "Retries" (if none is given, `pybliometrics` makes 5 attempts).
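The advice to wait a few seconds before the next query can be sketched as a small wrapper around any call. This is a minimal, library-agnostic sketch, not part of `pybliometrics`: the helper name `call_with_wait`, its parameters, and the `FakeServerError` stand-in are assumptions made for illustration only.

```python
import time


def call_with_wait(func, retry_on, attempts=3, wait_seconds=2.0, sleep=time.sleep):
    """Call func(); on one of the listed exceptions, wait and try again.

    retry_on is a tuple of exception classes. The helper itself does not
    know anything about pybliometrics (hypothetical helper).
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            sleep(wait_seconds)


class FakeServerError(Exception):
    """Hypothetical placeholder for a server-side error."""


calls = {"n": 0}


def flaky_query():
    # Stand-in for a query that fails twice and then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeServerError("temporary failure")
    return "ok"


# The sleep function is replaced so the demonstration does not actually wait.
result = call_with_wait(flaky_query, (FakeServerError,), sleep=lambda s: None)
print(result, calls["n"])
```

With `pybliometrics`, one would presumably pass `(ScopusServerError,)` from `pybliometrics.scopus.exception` as `retry_on`; this outer wait would run in addition to the retries the library already performs itself.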

docs/access/general.rst

Lines changed: 3 additions & 3 deletions
@@ -5,15 +5,15 @@ To access Scopus via its API, you need two things. First, your institution need
55

66
The Scopus API recognizes you as a member of your institution via IP range. For working from home, Scopus can also grant InstTokens. Thus one of three things needs to happen:
77

8-
1. You are in your instition's network
9-
2. You use your instition's VPN
8+
1. You are in your institution's network
9+
2. You use your institution's VPN
1010
3. You use an InstToken
1111

1212
Option 1 is easy and the most common.
1313

1414
Option 2 might require you to additionally set a proxy. You can do so in the :doc:`configuration file <../configuration>`.
1515

16-
Option 3 is rare. If you have an InstToken, please provide it during the setup when `pybliometrics` prompts you for it. Alternatively, add it to the :doc:`configuration file <../configuration>` manually. You may also set the InstToken via `insttoken="XYZ"` in any class. This is the preferred solution if you possess multiple keys.
16+
Option 3 is rare. An InstToken is provided directly by Scopus/Elsevier to allow remote access in the absence of a VPN. It is coupled directly to a key. If you have an InstToken, please provide it during the setup when `pybliometrics` prompts you for it. Alternatively, add it to the :doc:`configuration file <../configuration>` manually. You may also set the InstToken via `insttoken="XYZ"` in any class. This is the preferred solution if you possess multiple keys.
1717

1818
There are only three Scopus APIs that you can access without your institution subscribing to it: The Abstract Retrieval API, the Scopus Search API and the Subject Classifications API.
1919

docs/access/quotas.rst

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
11
API Key quotas and 429 error
22
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
33

4-
Each API key has a certain usage limit for different Scopus APIs. See https://dev.elsevier.com/api_key_settings.html for the list; for example, a key allows for 5,000 retrieval requests, or 20,000 search requests via the Scopus Search API.
4+
Each API key has certain usage limits for the different Scopus APIs, which are reset weekly. See https://dev.elsevier.com/api_key_settings.html for the list; for example, a key allows for 5,000 retrieval requests, or 20,000 search requests via the Scopus Search API.
55

6-
One week after the first usage, Scopus resets the key.
6+
The usage limits for each key are reset weekly, one week after their first usage. To this end, each class has two methods that can help you: `.get_key_remaining_quota()` tells you how many calls you have left with the current key for the last used API. `.get_key_reset_time()` tells you the time until reset.
77

88
`pybliometrics` will use all the keys provided in the :doc:`configuration file <../configuration>` when one key has exceeded its quota for the given API. Be sure to put all keys in the config.ini.
99
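The key-switching behaviour described above can be pictured with a short sketch. It is not how `pybliometrics` implements it internally; `QuotaExceeded`, `query_with_key_rotation`, the key names, and the quota numbers are all hypothetical and serve only to illustrate the idea of moving to the next key once one is depleted.

```python
class QuotaExceeded(Exception):
    """Hypothetical stand-in for a quota error such as Scopus429Error."""


def query_with_key_rotation(run_query, keys):
    """Try each key in order; move on when one reports an exhausted quota."""
    for key in keys:
        try:
            return key, run_query(key)
        except QuotaExceeded:
            continue  # this key is depleted, try the next one
    raise QuotaExceeded("all keys are depleted")


# Stand-in backend: remaining calls per key (made-up numbers).
remaining = {"key-a": 0, "key-b": 2}


def fake_query(key):
    if remaining[key] <= 0:
        raise QuotaExceeded(key)
    remaining[key] -= 1
    return "result"


used_key, result = query_with_key_rotation(fake_query, ["key-a", "key-b"])
print(used_key, result, remaining["key-b"])
```

If every key is depleted, the sketch raises the quota error itself, which mirrors the statement that a 429 error with multiple configured keys means all of them are exhausted.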

docs/classes/AbstractRetrieval.rst

Lines changed: 8 additions & 8 deletions
@@ -3,9 +3,9 @@ pybliometrics.scopus.AbstractRetrieval
33

44
`AbstractRetrieval()` implements the `Scopus Abstract Retrieval API <https://dev.elsevier.com/documentation/AbstractRetrievalAPI.wadl>`_.
55

6-
It takes any identifier as main argument: Most of the time it will be a `Scopus EID <http://kitchingroup.cheme.cmu.edu/blog/2015/06/07/Getting-a-Scopus-EID-from-a-DOI/>`_, but DOI, Scopus ID (the last part of the EID), PubMed identifier or Publisher Item Identifier (PII) work as well. `AbstractRetrieval` tries to infer the class itself - to speed this up you can tell the ID type via `ID_type`.
6+
It accepts any identifier as the main argument. Most commonly, this will be a `Scopus EID <http://kitchingroup.cheme.cmu.edu/blog/2015/06/07/Getting-a-Scopus-EID-from-a-DOI/>`_, but DOI, Scopus ID (the last part of the EID), PubMed identifier or Publisher Item Identifier (PII) work as well. `AbstractRetrieval` tries to infer the identifier type itself; to speed this up, you can specify the ID type via `ID_type`.
77

8-
The Abstract Retrieval API allows a differing information depth via `views <https://dev.elsevier.com/guides/AbstractRetrievalViews.htm>`_, some of which are restricted. The view 'META_ABS' is the highest unrestricted view and contains all information from other unrestricted views. It is therefore the default view. The view with the most information content is 'FULL', which includes all information available with 'META_ABS', but is restricted. In generally you should always try to use `view='FULL'` when downloading an abstract and fall back to the default otherwise.
8+
The Abstract Retrieval API allows a differing information depth via `views <https://dev.elsevier.com/guides/AbstractRetrievalViews.htm>`_, some of which are restricted. The 'META_ABS' view is the most comprehensive among unrestricted views, encompassing all information from other unrestricted views. It is therefore the default view. The view with the most information content is 'FULL', which includes all information available with 'META_ABS', but is restricted. Generally, you should always try to use `view='FULL'` when downloading an abstract and fall back to the default otherwise.
99

1010
.. currentmodule:: pybliometrics.scopus
1111
.. contents:: Table of Contents
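The recommendation to try `view='FULL'` first and fall back to the default view can be sketched as follows. This is only an illustration under stated assumptions: `retrieve_with_fallback`, `RestrictedView`, and `fake_fetch` are hypothetical names, and the exact exception the library raises for a restricted view is not stated in this document.

```python
class RestrictedView(Exception):
    """Hypothetical stand-in for the error raised on a restricted view."""


def retrieve_with_fallback(fetch, identifier, views=("FULL", "META_ABS")):
    """Return (view, document) for the first view that can be fetched."""
    last_error = None
    for view in views:
        try:
            return view, fetch(identifier, view=view)
        except RestrictedView as err:
            last_error = err  # remember why this view failed, try the next
    if last_error is None:
        raise ValueError("no views given")
    raise last_error


def fake_fetch(identifier, view):
    # Stand-in for a retrieval call: pretend 'FULL' is restricted for this key.
    if view == "FULL":
        raise RestrictedView("FULL view not available for this key")
    return {"id": identifier, "view": view}


view, doc = retrieve_with_fallback(fake_fetch, "2-s2.0-84887264733")
print(view, doc["id"])
```

In real use, `fetch` would presumably wrap `AbstractRetrieval(identifier, view=view)`, and `RestrictedView` would be replaced by whichever exception the library raises when the key lacks access to the requested view.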
@@ -62,7 +62,7 @@ There are 52 attributes and 8 methods to interact with. For example, to obtain
6262
True
6363
6464
65-
Attributes `idxterms`, `subject_areas` and `authkeywords` (if provided) provide an idea on the content of a document:
65+
The attributes `idxterms`, `subject_areas` and `authkeywords` (if provided) offer insights into the document's content:
6666
6767
.. code-block:: python
6868
@@ -84,7 +84,7 @@ To obtain the total citation count (at the time the abstract was retrieved and c
8484
34
8585
8686
87-
You get the authors as a list of `namedtuples <https://docs.python.org/3/library/collections.html#collections.namedtuple>`_, which pair conveniently with `pandas <https://pandas.pydata.org/>`_:
87+
You can retrieve the authors as a list of `namedtuples <https://docs.python.org/3/library/collections.html#collections.namedtuple>`_, which pair conveniently with `pandas <https://pandas.pydata.org/>`_:
8888
8989
.. code-block:: python
9090
@@ -125,7 +125,7 @@ The same structure applies for the attributes `affiliation` and `authorgroup`:
125125
126126
127127
128-
Keep in mind that Scopus might not perfectly/correctly pair authors and affiliations as per the original document, even if it looks so on the web view. In this case please request corrections to be made in Scopus' API here `here <https://service.elsevier.com/app/contact/supporthub/scopuscontent/>`_.
128+
Note that Scopus may not always accurately pair authors with their affiliations as per the original document, even if it looks so on the web view. In this case please request corrections to be made in Scopus' API `here <https://service.elsevier.com/app/contact/supporthub/scopuscontent/>`_.
129129
130130
The references of an article (useful to build citation networks) are only
131131
available if you downloaded the article with 'FULL' as `view` parameter.
@@ -166,7 +166,7 @@ available if you downloaded the article with 'FULL' as `view` parameter.
166166
'2-s2.0-84887264733']
167167
168168
169-
Setting `view="REF"` accesses the REF view of the article, which provides more information on the referenced items (but less on other attributes of the document):
169+
Using `view="REF"` accesses the REF view of the article, which provides more information on the referenced items (but less on other attributes of the document):
170170
171171
.. code-block:: python
172172
@@ -186,7 +186,7 @@ Setting `view="REF"` accesses the REF view of the article, which provides more i
186186
187187
The list of authors contains duplicates because of the 1:1 pairing with the authors' affiliation IDs. In the above example, 7003962139 is affiliated with 60033272 and with 60015849. Authors are therefore grouped by affiliation ID.
188188
189-
For conference proceedings, Scopus also collects information on the conference:
189+
Scopus also gathers detailed information about conferences for conference proceedings, including:
190190
191191
.. code-block:: python
192192
@@ -256,4 +256,4 @@ You can print the abstract in a variety of formats, including LaTeX, bibtex, HTM
256256
ER -
257257
258258
259-
Downloaded results are cached to speed up subsequent analysis. This information may become outdated. To refresh the cached results if they exist, set `refresh=True`, or provide an integer that will be interpreted as maximum allowed number of days since the last modification date. For example, if you want to refresh all cached results older than 100 days, set `refresh=100`. Use `ab.get_cache_file_mdate()` to get the date of last modification, and `ab.get_cache_file_age()` the number of days since the last modification.
259+
Downloaded results are cached to expedite subsequent analyses. This information may become outdated. To refresh the cached results if they exist, set `refresh=True`, or provide an integer that will be interpreted as maximum allowed number of days since the last modification date. For example, if you want to refresh all cached results older than 100 days, set `refresh=100`. Use `ab.get_cache_file_mdate()` to obtain the date of last modification, and `ab.get_cache_file_age()` to determine the number of days since the last modification.
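The `refresh` rule described above can be written down as a small decision function. This is a sketch of the documented semantics, not the library's code: `needs_refresh` is a hypothetical name, and treating "older than" as strictly greater than the given number of days is an assumption.

```python
def needs_refresh(refresh, cache_age_days):
    """Decide whether a cached result should be downloaded again.

    refresh: True/False, or an int giving the maximum allowed age in days.
    cache_age_days: days since the cache file was last modified,
        or None if no cached file exists.
    """
    if cache_age_days is None:
        return True  # nothing cached yet, so a download is needed anyway
    if isinstance(refresh, bool):  # check bool first: bool is a subclass of int
        return refresh
    return cache_age_days > refresh  # assumption: strictly older than the limit


print(needs_refresh(True, 3))      # True: always refresh
print(needs_refresh(False, 500))   # False: keep the cached file
print(needs_refresh(100, 150))     # True: older than 100 days
print(needs_refresh(100, 30))      # False: still fresh enough
```

The age passed in would correspond to what `ab.get_cache_file_age()` reports for an existing cache file.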

docs/classes/AffiliationRetrieval.rst

Lines changed: 6 additions & 6 deletions
@@ -1,7 +1,7 @@
11
pybliometrics.scopus.AffiliationRetrieval
22
=========================================
33

4-
`AffiliationRetrieval()` implements the `Affiliation Retrieval API <https://dev.elsevier.com/documentation/AffiliationRetrievalAPI.wadl>`_. It provides basic information on registered affiliations, like city, country, its members, and more.
4+
`AffiliationRetrieval()` implements the `Affiliation Retrieval API <https://dev.elsevier.com/documentation/AffiliationRetrievalAPI.wadl>`_. It provides basic information on registered affiliations, such as city, country, its members, and more.
55

66
.. currentmodule:: pybliometrics.scopus
77
.. contents:: Table of Contents
@@ -17,7 +17,7 @@ Documentation
1717
Examples
1818
--------
1919

20-
You initialize the class with Scopus' Affiliation ID:
20+
You initialize the class using Scopus' Affiliation ID:
2121

2222
.. code-block:: python
2323
@@ -34,7 +34,7 @@ You can obtain basic information just by printing the object:
3434
has 13,033 associated author(s) and 75,695 associated document(s) as of 2021-07-12
3535
3636
37-
The object has a number of attributes but no methods. For example, information regarding the affiliation itself:
37+
The object has several attributes but no methods. For example, information regarding the affiliation itself:
3838

3939
.. code-block:: python
4040
@@ -58,7 +58,7 @@ The object has a number of attributes but no methods. For example, information
5858
'http://www.uct.ac.za'
5959
6060
61-
There are meta information, too:
61+
There is meta-information, too:
6262

6363
.. code-block:: python
6464
@@ -79,7 +79,7 @@ Scopus also collects information on different names affiliated authors use for t
7979
Variant(name='Univ. Of Cape Town', doc_count=392)]
8080
8181
82-
Using `pandas <https://pandas.pydata.org/>`_ you can easily turn this into a DataFrame:
82+
Using `pandas <https://pandas.pydata.org/>`_, you can easily convert this into a DataFrame:
8383

8484
.. code-block:: python
8585
@@ -94,4 +94,4 @@ Using `pandas <https://pandas.pydata.org/>`_ you can easily turn this into a Dat
9494
9595
More on different types of affiliations in section `tips <../tips.html#affiliations>`_.
9696

97-
Downloaded results are cached to speed up subsequent analysis. This information may become outdated. To refresh the cached results if they exist, set `refresh=True`, or provide an integer that will be interpreted as maximum allowed number of days since the last modification date. For example, if you want to refresh all cached results older than 100 days, set `refresh=100`. Use `aff.get_cache_file_mdate()` to get the date of last modification, and `aff.get_cache_file_age()` the number of days since the last modification.
97+
Downloaded results are cached to expedite subsequent analyses. This information may become outdated. To refresh the cached results if they exist, set `refresh=True`, or provide an integer that will be interpreted as maximum allowed number of days since the last modification date. For example, if you want to refresh all cached results older than 100 days, set `refresh=100`. Use `aff.get_cache_file_mdate()` to obtain the date of last modification, and `aff.get_cache_file_age()` to determine the number of days since the last modification.
