
Where the score is the same, actives will rank higher than inactives #5

@baoilleach


While using this benchmark, I recently noticed a small error in the following line:

scores[fp].append(sorted(single_score[fp], reverse=True))

...which occurs in several similarly-named Python scripts.

Since each entry of single_score[fp] is a tuple of (simscore, id, active/inactive), the sort does rank first by similarity, but it then breaks ties by Id. The actives have Ids with 'A' in them where the decoys have 'D', and so the actives rank higher when the similarity is the same. However, sorting by similarity alone is not sufficient to avoid the problem: Python's sort is stable, and the actives are added to the list first, so among equal scores they will always occur ahead of the decoys. In other words, a random shuffle is needed before sorting. Here is a potential fix:

# "import random" and "random.seed(1)" at the top of the file
random.shuffle(single_score[fp])  # randomise the actives-first input order
scores[fp].append(sorted(single_score[fp], reverse=True, key=lambda x: x[0]))  # sort on similarity only
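To make the stable-sort point concrete, here is a minimal, self-contained sketch. The helper `rank_by_score`, the Ids, and the scores are all made up for illustration and are not taken from the benchmark scripts; it only assumes entries shaped like (simscore, id, active/inactive).

```python
import random

def rank_by_score(entries, shuffle=False, seed=1):
    """Rank (simscore, id, is_active) tuples by simscore only, descending.

    Hypothetical helper for illustration; not part of the benchmark code.
    """
    entries = list(entries)  # copy so the caller's list is left untouched
    if shuffle:
        # A local, seeded generator keeps the shuffle reproducible.
        random.Random(seed).shuffle(entries)
    return sorted(entries, key=lambda x: x[0], reverse=True)

# Made-up data: every molecule has the same score, actives are added first.
actives = [(0.5, "A%d" % i, 1) for i in range(3)]
decoys = [(0.5, "D%d" % i, 0) for i in range(3)]
entries = actives + decoys

# Without a shuffle, the stable sort keeps the input order among ties,
# so all actives stay ahead of all decoys.
unshuffled = rank_by_score(entries)
print([e[2] for e in unshuffled])  # [1, 1, 1, 0, 0, 0]

# With a shuffle, the order among ties depends only on the seed.
shuffled = rank_by_score(entries, shuffle=True)
print([e[2] for e in shuffled])
```

Averaged over many seeds, the shuffled version should place about half of the tied actives in the top half of the list, which is what an unbiased tie-break is expected to give.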
