Can training CogAct from PrismaticVLM achieve convergence? #41

@fx-hit

How long does training take to converge when starting from PrismaticVLM?
When I load the OpenVLA checkpoint and train on BridgeV2 for 5k iterations, I already observe a nonzero success rate.
However, when I load the prism-dinosiglip-224px+7b model, pretrain it on the OpenX dataset with 16 A100 GPUs for 40k iterations, and then train on BridgeV2 for 25k iterations, the success rate remains zero.
