
Fix Model Loading Code Snippet #4

Open

AdamBelfki3 wants to merge 1 commit into SakanaAI:main from AdamBelfki3:main

Conversation

@AdamBelfki3

The code snippet provided to load the model from Hugging Face fails because it uses the AutoModel class instead of AutoModelForCausalLM, which the model's custom configuration requires.

Error

ValueError: Unrecognized configuration class <class 'transformers_modules.SakanaAI.Llama-2-7b-hf-DroPE.ce5e32947bbca27d86f3467f934c5844ec9c2018.drope._create_drope_config_class.<locals>.DroPEConfig'> for this kind of AutoModel: AutoModel.
Model type should be one of Aimv2Config, Aimv2VisionConfig, AlbertConfig, AlignConfig, AltCLIPConfig, ApertusConfig, ArceeConfig, AriaConfig, AriaTextConfig, ASTConfig, AutoformerConfig, AyaVisionConfig, BambaConfig, BarkConfig, BartConfig, BeitConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BitConfig, BitNetConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, Blip2QFormerConfig, BloomConfig, BridgeTowerConfig, BrosConfig, CamembertConfig, CanineConfig, ChameleonConfig, ChineseCLIPConfig, ChineseCLIPVisionConfig, ClapConfig, CLIPConfig, CLIPTextConfig, CLIPVisionConfig, CLIPSegConfig, ClvpConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, Cohere2VisionConfig, ConditionalDetrConfig, ConvBertConfig, ConvNextConfig, ConvNextV2Config, CpmAntConfig, CsmConfig, CTRLConfig, CvtConfig, DFineConfig, DabDetrConfig, DacConfig, Data2VecAudioConfig, Data2VecTextConfig, Data2VecVisionConfig, DbrxConfig, DebertaConfig, DebertaV2Config, DecisionTransformerConfig, DeepseekV2Config, DeepseekV3Config, DeepseekVLConfig, DeepseekVLHybridConfig, DeformableDetrConfig, DeiTConfig, DepthProConfig, DetaConfig, DetrConfig, DiaConfig, DiffLlamaConfig, DinatConfig, Dinov2Config, Dinov2WithRegistersConfig, DINOv3ConvNextConfig, DINOv3ViTConfig, DistilBertConfig, DogeConfig, DonutSwinConfig, Dots1Config, DPRConfig, DPTConfig, EfficientFormerConfig, EfficientLoFTRConfig, EfficientNetConfig, ElectraConfig, Emu3Config, EncodecConfig, ErnieConfig, Ernie4_5Config, Ernie4_5_MoeConfig, ErnieMConfig, EsmConfig, EvollaConfig, Exaone4Config, FalconConfig, FalconH1Config, FalconMambaConfig, FastSpeech2ConformerConfig, FastSpeech2ConformerWithHifiGanConfig, FlaubertConfig, FlavaConfig, Florence2Config, FNetConfig, FocalNetConfig, FSMTConfig, FunnelConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, Gemma3nConfig, Gemma3nAudioConfig, Gemma3nTextConfig, Gemma3nVisionConfig, GitConfig, GlmConfig, Glm4Config, Glm4MoeConfig, Glm4vConfig, Glm4vMoeConfig, Glm4vMoeTextConfig, Glm4vTextConfig, GLPNConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GptOssConfig, GPTJConfig, GPTSanJapaneseConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeHybridConfig, GraniteMoeSharedConfig, GraphormerConfig, GroundingDinoConfig, GroupViTConfig, HeliumConfig, HGNetV2Config, HieraConfig, HubertConfig, HunYuanDenseV1Config, HunYuanMoEV1Config, IBertConfig, IdeficsConfig, Idefics2Config, Idefics3Config, Idefics3VisionConfig, IJepaConfig, ImageGPTConfig, InformerConfig, InstructBlipConfig, InstructBlipVideoConfig, InternVLConfig, InternVLVisionConfig, JambaConfig, JanusConfig, JetMoeConfig, JukeboxConfig, Kosmos2Config, Kosmos2_5Config, KyutaiSpeechToTextConfig, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LevitConfig, Lfm2Config, LightGlueConfig, LiltConfig, LlamaConfig, Llama4Config, Llama4TextConfig, LlavaConfig, LlavaNextConfig, LlavaNextVideoConfig, LlavaOnevisionConfig, LongformerConfig, LongT5Config, LukeConfig, LxmertConfig, M2M100Config, MambaConfig, Mamba2Config, MarianConfig, MarkupLMConfig, Mask2FormerConfig, MaskFormerConfig, MaskFormerSwinConfig, MBartConfig, MCTCTConfig, MegaConfig, MegatronBertConfig, MetaClip2Config, MgpstrConfig, MimiConfig, MiniMaxConfig, MistralConfig, Mistral3Config, MixtralConfig, MLCDVisionConfig, MllamaConfig, MMGroundingDinoConfig, MobileBertConfig, MobileNetV1Config, MobileNetV2Config, MobileViTConfig, MobileViTV2Config, 
ModernBertConfig, ModernBertDecoderConfig, MoonshineConfig, MoshiConfig, MPNetConfig, MptConfig, MraConfig, MT5Config, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NatConfig, NemotronConfig, NezhaConfig, NllbMoeConfig, NystromformerConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OmDetTurboConfig, OneFormerConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, Ovis2Config, Owlv2Config, OwlViTConfig, PaliGemmaConfig, PatchTSMixerConfig, PatchTSTConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, TimmWrapperConfig, PerceptionLMConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PixtralVisionConfig, PLBartConfig, PoolFormerConfig, ProphetNetConfig, PvtConfig, PvtV2Config, QDQBertConfig, Qwen2Config, Qwen2_5_VLConfig, Qwen2_5_VLTextConfig, Qwen2AudioEncoderConfig, Qwen2MoeConfig, Qwen2VLConfig, Qwen2VLTextConfig, Qwen3Config, Qwen3MoeConfig, RecurrentGemmaConfig, ReformerConfig, RegNetConfig, RemBertConfig, ResNetConfig, RetriBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RTDetrConfig, RTDetrV2Config, RwkvConfig, SamConfig, Sam2Config, Sam2HieraDetConfig, Sam2VideoConfig, Sam2VisionConfig, SamHQConfig, SamHQVisionConfig, SamVisionConfig, SeamlessM4TConfig, SeamlessM4Tv2Config, SeedOssConfig, SegformerConfig, SegGptConfig, SEWConfig, SEWDConfig, SiglipConfig, Siglip2Config, SiglipVisionConfig, SmolLM3Config, SmolVLMConfig, SmolVLMVisionConfig, Speech2TextConfig, SpeechT5Config, SplinterConfig, SqueezeBertConfig, StableLmConfig, Starcoder2Config, SwiftFormerConfig, SwinConfig, Swin2SRConfig, Swinv2Config, SwitchTransformersConfig, T5Config, T5GemmaConfig, TableTransformerConfig, TapasConfig, TextNetConfig, TimeSeriesTransformerConfig, TimesFmConfig, TimesformerConfig, TimmBackboneConfig, TimmWrapperConfig, TrajectoryTransformerConfig, TransfoXLConfig, TvltConfig, TvpConfig, UdopConfig, UMT5Config, UniSpeechConfig, UniSpeechSatConfig, UnivNetConfig, VanConfig, VideoLlavaConfig, VideoMAEConfig, ViltConfig, VipLlavaConfig, VisionTextDualEncoderConfig, VisualBertConfig, ViTConfig, ViTHybridConfig, ViTMAEConfig, ViTMSNConfig, VitDetConfig, VitsConfig, VivitConfig, VJEPA2Config, VoxtralConfig, VoxtralEncoderConfig, Wav2Vec2Config, Wav2Vec2BertConfig, Wav2Vec2ConformerConfig, WavLMConfig, WhisperConfig, XCLIPConfig, XcodecConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, xLSTMConfig, XmodConfig, YolosConfig, YosoConfig, ZambaConfig, Zamba2Config.

Correct code

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained('SakanaAI/Llama-2-7b-hf-DroPE', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('SakanaAI/Llama-2-7b-hf-DroPE', trust_remote_code=True, torch_dtype=torch.bfloat16)
python: 3.11.14
transformers: 4.56.12
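
For completeness, a quick sanity check once the snippet above has run; the prompt and generation settings below are illustrative assumptions, not taken from the model card:

# Assumes `model` and `tokenizer` from the snippet above are already loaded
# (and `torch` is imported). On a GPU machine you would typically also move
# the model and inputs to 'cuda' first.
inputs = tokenizer('The capital of France is', return_tensors='pt')
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))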

Updated model import for DroPE to use AutoModelForCausalLM.
