🔍 Code Extractor

function get_bib

Maturity: 47

Fetches BibTeX citation data for a given DOI (Digital Object Identifier) from the CrossRef API.

File: /tf/active/vicechatdev/offline_parser_docstore.py
Lines: 31 - 52
Complexity: simple

Purpose

This function retrieves bibliographic information in BibTeX format for academic papers and publications using their DOI. It queries the CrossRef API's transformation service to convert DOI metadata into a formatted BibTeX citation string. This is useful for automated citation management, bibliography generation, and academic reference systems.

Source Code

def get_bib(doi):
    """
    Parameters
    ----------

        doi: str

    Returns
    -------

        found: bool
        bib: str
    """
    bare_url = "http://api.crossref.org/"
    url = "{}works/{}/transform/application/x-bibtex"
    url = url.format(bare_url, doi)
    r = requests.get(url)
    #found = False if r.status_code != 200 else True
    bib = r.content
    bib = str(bib, "utf-8")

    return bib

Parameters

Name   Type   Default   Kind
doi    -      -         positional_or_keyword

Parameter Details

doi: A string containing the Digital Object Identifier (DOI) for a publication, in standard DOI format (e.g., '10.1000/xyz123'). The function inserts the value into the request URL unchanged and performs no validation or normalization, so a bare DOI is the safest input. Whether a value carrying a 'doi:' prefix or an 'https://doi.org/' URL prefix resolves depends entirely on the CrossRef API.
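
Since get_bib passes the DOI to the URL as given, a caller may want to reduce it to a bare DOI first. The following is a minimal sketch, not part of the original module: the helper name normalize_doi, the prefix list, and the loose shape check ("10.", a 4 to 9 digit registrant code, a slash, a non-empty suffix) are all assumptions chosen for illustration.

```python
import re

# Prefixes that often appear in front of a DOI (illustrative list, an assumption).
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

# Loose shape check, not a full validator: "10.", 4-9 digits, "/", non-empty suffix.
_DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(raw):
    """Return a bare DOI string, or None if the input does not look like a DOI."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _DOI_PREFIXES:
        if lowered.startswith(prefix):
            # Drop the first matching prefix (matched case-insensitively).
            doi = doi[len(prefix):]
            break
    doi = doi.strip()
    return doi if _DOI_PATTERN.match(doi) else None
```

A caller could then skip the network request whenever normalize_doi returns None.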

Return Value

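Returns a string containing the HTTP response body decoded as UTF-8. For a DOI that CrossRef resolves, this is the BibTeX-formatted citation, with standard fields such as author, title, journal, and year. Note: the docstring describes a tuple (found: bool, bib: str), but the implementation returns only the bib string; the status check that would have set found is commented out. Because the HTTP status code is never checked, a DOI that is not found yields whatever body the API sends back (an error message or empty content) rather than a citation. Network failures are not caught, so exceptions raised by requests.get() propagate to the caller.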
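
Because get_bib returns only a string, the caller has no status flag to inspect. One rough way to tell a citation from an error body is sketched below. It is a heuristic, not part of the original module: the helper name looks_like_bibtex is hypothetical, and the check assumes only that a BibTeX entry starts with '@' once leading whitespace is removed.

```python
def looks_like_bibtex(text):
    """Heuristic check: does the text start like a BibTeX entry?

    Assumption: a BibTeX entry begins with '@' (e.g. '@article{...'),
    possibly preceded by whitespace. Error messages and empty bodies do not.
    """
    return text.lstrip().startswith("@")
```

Checking r.status_code inside the function itself is more reliable; this check only helps callers that cannot change get_bib.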

Dependencies

  • requests

Required Imports

import requests

Usage Example

import requests

def get_bib(doi):
    bare_url = "http://api.crossref.org/"
    url = "{}works/{}/transform/application/x-bibtex"
    url = url.format(bare_url, doi)
    r = requests.get(url)
    bib = r.content
    bib = str(bib, "utf-8")
    return bib

# Example usage
doi = "10.1038/nature12373"
bib_citation = get_bib(doi)
print(bib_citation)

# Output will be a BibTeX formatted string like:
# @article{Author_2013,
#   title={Article Title},
#   author={Author, First and Author, Second},
#   journal={Nature},
#   year={2013},
#   ...
# }

Best Practices

  • Add error handling to check r.status_code before processing the response (the commented-out 'found' variable suggests this was intended)
  • Consider adding timeout parameter to requests.get() to prevent hanging on slow connections
  • Validate the DOI format before making the API request to avoid unnecessary network calls
  • Handle potential exceptions from requests.get() (network errors, timeouts, etc.)
  • The function should return both success status and bib content as indicated in the docstring, or update the docstring to match actual behavior
  • Consider using HTTPS instead of HTTP for the API URL for better security
  • Add retry logic for transient network failures
  • Cache results for frequently requested DOIs to reduce API calls
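
Several of the points above (status check, timeout, exception handling, HTTPS, and a return value that matches the docstring) can be combined in one variant. This is a sketch under stated assumptions rather than the module's actual code: the name get_bib_checked, the 10-second default timeout, and the optional session parameter (added so the function can be exercised without network access) are all illustrative choices.

```python
import requests

# Same endpoint as the original function, but over HTTPS.
CROSSREF_URL = "https://api.crossref.org/works/{}/transform/application/x-bibtex"


def get_bib_checked(doi, timeout=10, session=None):
    """Return (found, bib), the tuple described in the original docstring.

    found is False when the request fails or the status code is not 200;
    bib is then an empty string.
    """
    # Hypothetical hook: any object with a requests-style .get() can be passed in.
    http = session if session is not None else requests
    try:
        r = http.get(CROSSREF_URL.format(doi), timeout=timeout)
    except requests.RequestException:
        # Covers connection errors and timeouts raised by requests.
        return False, ""
    if r.status_code != 200:
        return False, ""
    # Decode the body the same way the original function does.
    return True, str(r.content, "utf-8")
```

Returning an empty string on failure keeps the second element a str in every case; a caller that needs the error body or status code would need a different return shape.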

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function get_bibtext 74.7% similar

    Retrieves and parses BibTeX citation data for a given DOI (Digital Object Identifier), extracting the title and formatted BibTeX string.

    From: /tf/active/vicechatdev/offline_parser_docstore.py
  • function parse_references_section 36.4% similar

    Parses a formatted references section string and extracts structured data including reference numbers, sources, and content previews using regular expressions.

    From: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py
  • class ReferenceManager_v1 34.9% similar

    Manages document references for inline citation and bibliography generation, tracking documents and generating formatted citations and bibliographies.

    From: /tf/active/vicechatdev/improved_project_victoria_generator.py
  • function search_documents 31.6% similar

    Searches for documents in a Neo4j graph database based on multiple optional filter criteria including text query, document type, department, status, and owner.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • class ReferenceManager 31.3% similar

    Manages document references for inline citation and bibliography generation in a RAG (Retrieval-Augmented Generation) system.

    From: /tf/active/vicechatdev/fixed_project_victoria_generator.py