Open
Description
Current State
BacDive data is currently fetched from a Google Drive file (see PR #349). The download_bacdive.py utility script exists (PR #273, thanks @realmarcin) and was updated to use .env credentials (PR #314), but it's not integrated into the standard kg download workflow.
Proposal
Integrate BacDive API fetching into kg download, similar to how MediaDive bulk download works (see _post_download_mediadive_bulk() in download.py).
This would make the build fully reproducible from source for users with BacDive API credentials.
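A minimal sketch of what such a post-download step might look like. The function name post_download_bacdive, its signature, and the injected fetch_strains callable are all hypothetical. They are not taken from download.py and do not claim to mirror how _post_download_mediadive_bulk() is actually written.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def post_download_bacdive(
    output_path: Path,
    fetch_strains: Callable[[], Iterable[dict]],
) -> int:
    """Hypothetical post-download hook: fetch strain records and write them as JSON.

    fetch_strains stands in for whatever wraps the BacDive API client.
    Returns the number of records written.
    """
    records = list(fetch_strains())
    output_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so an interrupted run does not
    # leave a truncated JSON file at the final location.
    tmp_path = output_path.with_name(output_path.name + ".tmp")
    tmp_path.write_text(json.dumps(records), encoding="utf-8")
    tmp_path.replace(output_path)
    return len(records)
```

Injecting the fetcher keeps the file-writing logic testable without network access or credentials.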
Notes
- The current download_bacdive.py scans 200k IDs sequentially (~97k actually exist) - may need optimization
- Please correct me if there's existing integration I'm overlooking
- Related: #333 (identify python libraries that are necessary for core tasks like kg download but are not specified in pyproject.toml) and #335 (Is DSMZ "giving us" the bacdive_strains.json file, or are we scraping it with https://pypi.org/project/bacdive/ ?)
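One possible direction for the optimization mentioned in the first note is to fetch IDs concurrently and drop the ones that do not exist. The sketch below assumes a hypothetical fetch_one(strain_id) callable that returns a record, or None for a missing ID. It does not reflect how download_bacdive.py is currently written, and the worker count is a guess that would need to respect whatever rate limits the BacDive API imposes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def scan_ids(
    ids: Iterable[int],
    fetch_one: Callable[[int], Optional[dict]],
    max_workers: int = 8,
) -> Dict[int, dict]:
    """Fetch many IDs concurrently, keeping only the IDs that exist.

    fetch_one is a hypothetical wrapper around a single-ID API lookup
    that returns None when the ID has no record.
    """
    id_list = list(ids)
    found: Dict[int, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map() yields results in input order, so zip pairs each
        # result with the ID that produced it.
        for strain_id, record in zip(id_list, pool.map(fetch_one, id_list)):
            if record is not None:
                found[strain_id] = record
    return found
```

A full scan would then be something like scan_ids(range(1, 200_001), fetch_one). If the API accepts several IDs per request, batching requests may reduce the call count further, but that would need checking against the API documentation.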
Acceptance Criteria
- kg download can fetch BacDive data via API (with credentials)
- Documentation for credential setup
- Fallback behavior if credentials unavailable
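The credential and fallback criteria could be covered by a small source-selection step, sketched below. The variable names BACDIVE_EMAIL and BACDIVE_PASSWORD and the source labels "api" and "prebuilt" are assumptions made for illustration; the keys actually used in .env since PR #314 may differ.

```python
import logging
import os
from typing import Mapping, Optional, Tuple

logger = logging.getLogger(__name__)

# Hypothetical variable names; the project's real .env keys may differ.
EMAIL_VAR = "BACDIVE_EMAIL"
PASSWORD_VAR = "BACDIVE_PASSWORD"


def get_bacdive_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (email, password) if both are set and non-empty, else None."""
    env = os.environ if env is None else env
    email = env.get(EMAIL_VAR, "").strip()
    password = env.get(PASSWORD_VAR, "").strip()
    if email and password:
        return email, password
    return None


def choose_bacdive_source(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick "api" when credentials exist, otherwise fall back to "prebuilt"."""
    if get_bacdive_credentials(env) is not None:
        return "api"
    logger.warning(
        "BacDive credentials not found (%s / %s); using the prebuilt file instead.",
        EMAIL_VAR,
        PASSWORD_VAR,
    )
    return "prebuilt"
```

Under these assumptions, the credential documentation would only need to tell users to add the two variables to their .env file, and users without credentials would keep getting the existing Google Drive file.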