Support retrieval of scxa anndata files by accession #106

@dosumis

Status: draft

Question: Does this belong in VFB_connect or in a separate library? Probably the latter.

This works:

import os
import warnings

import anndata
import requests


def read_h5ad_from_scxa(accession, dir='./'):
    """
    Retrieve an anndata file from scxa by accession. Save the file to disk and return an anndata object.

    ARGS:
    * accession: scxa experiment accession, used to build the download URL.

    KWARGS:
    * dir: Optionally specify the directory where the anndata file should be stored.

    Returns False (after issuing a warning) if the request fails.
    """
    filename = accession + '.project.h5ad'
    url = "http://ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/sc_experiments/%s/%s.project.h5ad" % (accession, accession)
    r = requests.get(url)
    if r.status_code != 200:
        warnings.warn("request failed: " + r.reason)
        return False
    # os.path.join works whether or not dir ends with a path separator
    filepath = os.path.join(dir, filename)
    with open(filepath, 'wb') as h5ad:
        h5ad.write(r.content)
    return anndata.read_h5ad(filepath)
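
The function above holds the whole response in memory (`r.content`) before writing it out, which may be a problem for large h5ad files. Below is a minimal sketch of a chunked download using only the standard library. The helper names `scxa_h5ad_url` and `stream_to_file` are hypothetical and not part of VFB_connect; the URL pattern is the one used above.

```python
import os
import shutil
import tempfile
import urllib.request

# URL pattern copied from the function above.
SCXA_URL = ("http://ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/"
            "sc_experiments/{acc}/{acc}.project.h5ad")


def scxa_h5ad_url(accession):
    """Build the download URL for an accession (hypothetical helper)."""
    return SCXA_URL.format(acc=accession)


def stream_to_file(url, filepath, chunk_size=1024 * 1024):
    """Copy url to filepath in chunks rather than loading it all into memory.

    Data goes to a temporary file in the target directory first, so a failed
    download does not leave a truncated file at filepath.
    urllib raises urllib.error.HTTPError for HTTP error responses.
    """
    dest_dir = os.path.dirname(os.path.abspath(filepath))
    os.makedirs(dest_dir, exist_ok=True)
    tmp = tempfile.NamedTemporaryFile(dir=dest_dir, delete=False)
    try:
        with tmp, urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, tmp, chunk_size)
    except BaseException:
        os.remove(tmp.name)  # clean up the partial download
        raise
    os.replace(tmp.name, filepath)
    return filepath
```

The saved file could then be opened with `anndata.read_h5ad(filepath)` as before. Unlike the function above, this sketch raises on failure instead of warning and returning False.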

The result is still quite far from the CxG standard for obs and var, e.g.:

var['gene_name'] --> feature_name

authors_cell_type_-_ontology_labels --> cell_type

authors_cell_type_-_ontology_labels_ontology --> cell_type_ontology_term_id - using CURIEs for values in place of PURLs
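
The mappings above could be applied roughly as follows. This is a sketch, not a full CxG conversion: it covers only the three fields listed, the function names are hypothetical, and the spelling of the cell type label column is an assumption inferred from the ontology column name. The PURL-to-CURIE step assumes term IRIs end in `PREFIX_LOCALID`, as OBO PURLs do.

```python
# Renames taken from the list above.
VAR_RENAMES = {"gene_name": "feature_name"}
OBS_RENAMES = {
    # Assumed spelling of the label column (inferred, not confirmed).
    "authors_cell_type_-_ontology_labels": "cell_type",
    "authors_cell_type_-_ontology_labels_ontology": "cell_type_ontology_term_id",
}


def purl_to_curie(term):
    """Turn e.g. http://purl.obolibrary.org/obo/FBbt_00005106 into FBbt:00005106.

    Values that are not strings, or whose last path segment contains no
    underscore, are returned unchanged.
    """
    if not isinstance(term, str):
        return term
    fragment = term.rstrip("/").rsplit("/", 1)[-1]
    if "_" not in fragment:
        return term
    prefix, local_id = fragment.rsplit("_", 1)
    return prefix + ":" + local_id


def rename_towards_cxg(adata):
    """Apply the renames to an AnnData-like object with .obs and .var DataFrames."""
    adata.var = adata.var.rename(columns=VAR_RENAMES)
    adata.obs = adata.obs.rename(columns=OBS_RENAMES)
    col = "cell_type_ontology_term_id"
    if col in adata.obs.columns:
        adata.obs[col] = adata.obs[col].map(purl_to_curie)
    return adata
```

Columns missing from a given file are simply skipped, since `rename` ignores keys that are not present.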
