feat(pt_expt): implement se_t and se_t_tebd descriptors. #5206
Conversation
Summary of Changes

Hello @wanghan-iapcm, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly advances DeepMD-kit's integration with PyTorch by introducing a new exportable backend, which allows the existing dpmodel implementations to be wrapped and used within PyTorch's export ecosystem.
Code Review
This pull request introduces a new pt_expt (PyTorch Exportable) backend, which is a significant and well-executed feature addition. The implementation cleverly wraps the existing array-api-compat-based dpmodel classes in torch.nn.Module wrappers, enabling their use within PyTorch's ecosystem. The use of a generic dpmodel_setattr helper and a registry system to automate the conversion of numpy arrays and nested objects is an excellent design choice that enhances maintainability.
The PR also includes a crucial refactoring of EmbeddingNet to support this extensible design and fixes several bugs in other parts of the codebase, such as in thread handling and environment variable processing. The new backend and changes are thoroughly covered by new tests.
My main suggestion for improvement is to address the code duplication in the environment setup files for the pt and pt_expt backends, which could be refactored.
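For readers new to this pattern, here is a minimal sketch of what a registry-driven dpmodel-to-torch conversion can look like. All names below (_CONVERTERS, register_converter, to_torch) are illustrative assumptions, not the PR's actual API:

```python
# Illustrative sketch only; the PR's dpmodel_setattr helper and registry
# may be organized quite differently.
from typing import Any, Callable

import numpy as np
import torch

# maps a dpmodel class to a factory producing its torch.nn.Module wrapper
_CONVERTERS: dict[type, Callable[[Any], torch.nn.Module]] = {}


def register_converter(dpmodel_cls: type) -> Callable:
    """Register a torch wrapper factory for a dpmodel class."""

    def decorator(factory: Callable[[Any], torch.nn.Module]) -> Callable:
        _CONVERTERS[dpmodel_cls] = factory
        return factory

    return decorator


def to_torch(obj: Any) -> Any:
    """Convert numpy arrays and registered dpmodel objects to torch types."""
    if isinstance(obj, np.ndarray):
        # numpy parameters become torch tensors
        return torch.from_numpy(obj)
    factory = _CONVERTERS.get(type(obj))
    if factory is not None:
        # nested dpmodel objects are replaced by their torch wrappers
        return factory(obj)
    return obj
```

A setattr-based hook (in the spirit of the PR's dpmodel_setattr) can then call such a converter whenever an attribute is assigned, so wrapping happens automatically instead of per class by hand.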
deepmd/pt_expt/utils/env.py:

```python
# SPDX-License-Identifier: LGPL-3.0-or-later
import logging
import multiprocessing
import os
import sys

import numpy as np

from deepmd.common import (
    VALID_PRECISION,
)
from deepmd.env import (
    GLOBAL_ENER_FLOAT_PRECISION,
    GLOBAL_NP_FLOAT_PRECISION,
    get_default_nthreads,
    set_default_nthreads,
)

log = logging.getLogger(__name__)
import torch

if sys.platform != "win32":
    try:
        multiprocessing.set_start_method("fork", force=True)
        log.debug("Successfully set multiprocessing start method to 'fork'.")
    except (RuntimeError, ValueError) as err:
        log.warning(f"Could not set multiprocessing start method: {err}")
else:
    log.debug("Skipping fork start method on Windows (not supported).")

SAMPLER_RECORD = os.environ.get("SAMPLER_RECORD", False)
DP_DTYPE_PROMOTION_STRICT = os.environ.get("DP_DTYPE_PROMOTION_STRICT", "0") == "1"
try:
    # only linux
    ncpus = len(os.sched_getaffinity(0))
except AttributeError:
    ncpus = os.cpu_count()
NUM_WORKERS = int(os.environ.get("NUM_WORKERS", min(4, ncpus)))
if multiprocessing.get_start_method() != "fork":
    # spawn or forkserver does not support NUM_WORKERS > 0 for DataLoader
    log.warning(
        "NUM_WORKERS > 0 is not supported with spawn or forkserver start method. "
        "Setting NUM_WORKERS to 0."
    )
    NUM_WORKERS = 0

# Make sure DDP uses correct device if applicable
LOCAL_RANK = os.environ.get("LOCAL_RANK")
LOCAL_RANK = int(0 if LOCAL_RANK is None else LOCAL_RANK)

if os.environ.get("DEVICE") == "cpu" or torch.cuda.is_available() is False:
    DEVICE = torch.device("cpu")
else:
    DEVICE = torch.device(f"cuda:{LOCAL_RANK}")

JIT = False
CACHE_PER_SYS = 5  # keep at most so many sets per sys in memory
ENERGY_BIAS_TRAINABLE = True
CUSTOM_OP_USE_JIT = False

PRECISION_DICT = {
    "float16": torch.float16,
    "float32": torch.float32,
    "float64": torch.float64,
    "half": torch.float16,
    "single": torch.float32,
    "double": torch.float64,
    "int32": torch.int32,
    "int64": torch.int64,
    "bfloat16": torch.bfloat16,
    "bool": torch.bool,
}
GLOBAL_PT_FLOAT_PRECISION = PRECISION_DICT[np.dtype(GLOBAL_NP_FLOAT_PRECISION).name]
GLOBAL_PT_ENER_FLOAT_PRECISION = PRECISION_DICT[
    np.dtype(GLOBAL_ENER_FLOAT_PRECISION).name
]
PRECISION_DICT["default"] = GLOBAL_PT_FLOAT_PRECISION
assert VALID_PRECISION.issubset(PRECISION_DICT.keys())
# cannot be automatically generated
RESERVED_PRECISION_DICT = {
    torch.float16: "float16",
    torch.float32: "float32",
    torch.float64: "float64",
    torch.int32: "int32",
    torch.int64: "int64",
    torch.bfloat16: "bfloat16",
    torch.bool: "bool",
}
assert set(PRECISION_DICT.values()) == set(RESERVED_PRECISION_DICT.keys())
DEFAULT_PRECISION = "float64"

# throw warnings if threads not set
set_default_nthreads()
intra_nthreads, inter_nthreads = get_default_nthreads()
if inter_nthreads > 0:  # the behavior of 0 is not documented
    # torch.set_num_interop_threads can only be called once per process.
    # Guard to avoid RuntimeError when both pt and pt_expt env modules are imported.
    try:
        if torch.get_num_interop_threads() != inter_nthreads:
            torch.set_num_interop_threads(inter_nthreads)
    except RuntimeError as err:
        log.warning(f"Could not set torch interop threads: {err}")
if intra_nthreads > 0:
    # torch.set_num_threads can also fail if called after threads are created.
    try:
        if torch.get_num_threads() != intra_nthreads:
            torch.set_num_threads(intra_nthreads)
    except RuntimeError as err:
        log.warning(f"Could not set torch intra threads: {err}")

__all__ = [
    "CACHE_PER_SYS",
    "CUSTOM_OP_USE_JIT",
    "DEFAULT_PRECISION",
    "DEVICE",
    "ENERGY_BIAS_TRAINABLE",
    "GLOBAL_ENER_FLOAT_PRECISION",
    "GLOBAL_NP_FLOAT_PRECISION",
    "GLOBAL_PT_ENER_FLOAT_PRECISION",
    "GLOBAL_PT_FLOAT_PRECISION",
    "JIT",
    "LOCAL_RANK",
    "NUM_WORKERS",
    "PRECISION_DICT",
    "RESERVED_PRECISION_DICT",
    "SAMPLER_RECORD",
]
```
This file is almost an identical copy of deepmd/pt/utils/env.py. This significant code duplication can lead to maintenance issues, where a change in one file might not be propagated to the other.
Consider refactoring the common environment settings (device selection, precision dictionaries, thread management, etc.) into a shared module, for example, deepmd.torch_utils.env, which both pt and pt_expt backends can import and use. The backend-specific parts, if any, can then be handled within their respective env.py files.
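As a rough illustration of that suggestion (the module path and function below are assumptions, not code from this PR), the shared logic could be factored out once and imported by both backends:

```python
# Hypothetical deepmd/torch_utils/env.py: a sketch of the proposed refactor.
import os

import torch


def resolve_device(local_rank: int) -> torch.device:
    """Pick the CPU or the rank-local CUDA device, honoring DEVICE=cpu."""
    if os.environ.get("DEVICE") == "cpu" or not torch.cuda.is_available():
        return torch.device("cpu")
    return torch.device(f"cuda:{local_rank}")


# Each backend's env.py would then keep only its own constants, e.g.:
#   from deepmd.torch_utils.env import resolve_device
#   DEVICE = resolve_device(LOCAL_RANK)
#   JIT = False  # backend-specific
```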
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 621c7ccbbc
From deepmd/backend/pt_expt.py:

```python
class PyTorchExportableBackend(Backend):
    """PyTorch exportable backend."""

    name = "PyTorch Exportable"
```
Rename pt-expt backend to a key-compatible identifier
In deepmd/main.py, backend CLI options are normalized with backend.name.lower() and then validated against the registered backend keys. With name = "PyTorch Exportable", both --pt-expt and --backend pytorch-exportable resolve to "pytorch exportable", which is not a registered key (the keys are pt-expt/pytorch-exportable), so selecting this backend via the CLI fails with "Unknown backend" before any command runs.
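A short, self-contained illustration of the mismatch (the registry contents below are paraphrased from this report, not quoted from deepmd/main.py):

```python
# Paraphrased illustration of the reported CLI failure mode.
registered_keys = {"pt-expt", "pytorch-exportable"}  # assumed registered keys

name = "PyTorch Exportable"
key = name.lower()  # -> "pytorch exportable" (space, not hyphen)

assert key not in registered_keys  # hence "Unknown backend" at the CLI
```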
```python
    features: ClassVar[Backend.Feature] = (
        Backend.Feature.ENTRY_POINT
        | Backend.Feature.DEEP_EVAL
        | Backend.Feature.NEIGHBOR_STAT
        | Backend.Feature.IO
    )
```
Remove unsupported feature flags from pt-expt backend
This backend advertises DEEP_EVAL, NEIGHBOR_STAT, and IO, so .pte files pass feature/suffix filtering in common workflows (for example dp test via format_model_suffix and backend detection), but the corresponding hooks (deep_eval, neighbor_stat, serialize_hook, deserialize_hook) still raise NotImplementedError, which causes accepted commands on .pte models to fail at runtime.
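A self-contained mock reproducing the described failure mode follows; the real Backend class differs in detail:

```python
# Mock of the reported behavior: an advertised flag whose hook is missing.
import enum


class Feature(enum.Flag):
    ENTRY_POINT = enum.auto()
    DEEP_EVAL = enum.auto()


class PtExptBackendMock:
    features = Feature.ENTRY_POINT | Feature.DEEP_EVAL  # advertised

    @property
    def deep_eval(self):
        raise NotImplementedError  # hook not implemented yet


backend = PtExptBackendMock()
if backend.features & Feature.DEEP_EVAL:  # the caller trusts the flag...
    try:
        backend.deep_eval
    except NotImplementedError:
        print("advertised feature failed at runtime")  # ...and hits this
```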
📝 Walkthrough

This PR refactors EmbeddingNet from a factory-generated class to a concrete class with explicit serialization, and introduces a comprehensive registry-driven system for converting dpmodel instances to PyTorch wrappers (pt_expt). New descriptor, network, and exclude-mask wrappers integrate with the registry for seamless dpmodel-to-torch module conversion.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (2 warnings)
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@deepmd/backend/pt_expt.py`:
- Around line 37-42: The class-level features tuple declares
Backend.Feature.DEEP_EVAL, NEIGHBOR_STAT, and IO but the corresponding members
deep_eval, neighbor_stat, serialize_hook and deserialize_hook raise
NotImplementedError; fix this by either removing those unsupported flags from
the features tuple in the class (update the features variable to only include
ENTRY_POINT and any truly supported flags) or keep the flags but add a clear
TODO/roadmap comment above the features declaration referencing deep_eval,
neighbor_stat, serialize_hook and deserialize_hook and a target milestone for
implementing them so callers won’t be misled; adjust the features tuple or add
the comment in pt_expt.py accordingly.
In `@deepmd/dpmodel/utils/network.py`:
- Around line 876-896: The current EmbeddingNet.deserialize assumes obj.layers
is non-empty by doing layer_type = type(obj.layers[0]); add a guard to handle
empty obj.layers: if obj.layers is non-empty keep the existing behavior (use
layer_type = type(obj.layers[0]) and deserialize each serialized layer with
layer_type.deserialize), otherwise fall back to per-dict deserialization by
resolving each serialized layer's class (e.g. via the layer dict's "@class"
entry or a LayerBase.deserialize factory) and create obj.layers from those
deserialized instances; update the code in deserialize to check if not
obj.layers before computing layer_type and choose the appropriate
deserialization path.
In `@deepmd/pt_expt/utils/type_embed.py`:
- Around line 14-15: Remove the redundant noqa directive from the side-effect
import: locate the import statement that brings in deepmd.pt_expt.utils.network
(the comment above it explains it's to register EmbeddingNet) and delete the
trailing "# noqa: F401" so the import remains but the unnecessary noqa is
removed; ensure the import line still stays in place and no other changes are
made to the comment or import name.
In `@source/tests/consistent/descriptor/test_se_r.py`:
- Line 30: The fallback variable names are inconsistent: when INSTALLED_PT is
False the diff sets DescrptSeAPT and DescrptSeATF to None but other code (e.g.,
pt_class) expects DescrptSeRPT and DescrptSeRTF; update the fallback assignments
so the variables referenced elsewhere are defined (set DescrptSeRPT = None and
DescrptSeRTF = None when INSTALLED_PT is False), or alternatively rename usages
in pt_class to match the existing fallback names, ensuring DescrptSeRPT and
DescrptSeRTF exist to avoid a NameError.
In `@source/tests/pt_expt/descriptor/test_se_t_tebd.py`:
- Around line 149-159: The test constructs an inputs tuple that places
type_embedding as the 4th positional argument, which binds to forward's
extended_atype_embd (signature: forward(self, extended_coord, extended_atype,
nlist, extended_atype_embd=None, mapping=None, type_embedding=None)), causing
type_embedding to be exported as None; update the call that exports dd0 so
type_embedding is passed into the correct parameter—either insert explicit None
placeholders for extended_atype_embd and mapping before type_embedding in the
inputs tuple, or switch to exporting with keyword args (e.g., pass
type_embedding=...) when calling torch.export.export(dd0, ...), ensuring
dd0.forward receives type_embedding correctly.
In `@source/tests/pt_expt/utils/test_network.py`:
- Around line 162-164: The test uses an unseeded RNG (rng =
np.random.default_rng()) which makes the test non-deterministic; change the RNG
initialization to use the shared GLOBAL_SEED (e.g., rng =
np.random.default_rng(GLOBAL_SEED)) so x_np is reproducible, ensure GLOBAL_SEED
is imported/available in this test module, and keep the subsequent x_np =
rng.standard_normal((5, self.in_dim)) call unchanged.
🧹 Nitpick comments (10)
deepmd/pt_expt/utils/env.py (2)
31-31: `SAMPLER_RECORD` may unintentionally be truthy for string values like `"0"` or `"false"`.
`os.environ.get("SAMPLER_RECORD", False)` returns a string when the variable is set, so `SAMPLER_RECORD="0"` would be truthy. If this matches the existing pattern in `deepmd/pt/utils/env.py`, it's fine to keep consistent, but worth noting.
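If strict boolean parsing is ever wanted, a minimal sketch (not what the module currently does):

```python
import os

# Treat only common truthy spellings as "on"; "0", "false", and unset all
# disable the flag.
SAMPLER_RECORD = os.environ.get("SAMPLER_RECORD", "0").strip().lower() in (
    "1",
    "true",
    "yes",
    "on",
)
```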
1-127: Significant code duplication with `deepmd/pt/utils/env.py`.
This file appears to be largely a copy of the PT backend's env module. While this may be intentional to keep pt_expt independent, you might consider extracting the shared logic (precision dicts, device detection, thread config) into a common base module to reduce maintenance burden. Both modules would then only override backend-specific constants (e.g., `JIT`, `CUSTOM_OP_USE_JIT`).

deepmd/dpmodel/utils/network.py (1)
816-826: Mutable default argument for `neuron`.
Ruff B006 flags `neuron: list[int] = [24, 48, 96]`. While it's safe here (the list is never mutated in-place), this is a well-known Python footgun. That said, the same pattern exists in the factory function on line 714 and is consistent across the codebase, so changing it now might be out of scope.

Optional fix:

```diff
 def __init__(
     self,
     in_dim: int,
-    neuron: list[int] = [24, 48, 96],
+    neuron: list[int] | None = None,
     activation_function: str = "tanh",
     resnet_dt: bool = False,
     precision: str = DEFAULT_PRECISION,
     seed: int | list[int] | None = None,
     bias: bool = True,
     trainable: bool | list[bool] = True,
 ) -> None:
+    if neuron is None:
+        neuron = [24, 48, 96]
     layers = []
```

source/tests/consistent/common.py (1)
70-79: Pre-existing: duplicate `"CommonTest"` in `__all__`.
Line 78 duplicates line 77. Not introduced by this PR, but worth noting.
deepmd/pt_expt/utils/network.py (1)
27-37: Remove unused `noqa` directive.
Ruff correctly flags `noqa: PYI034` as unnecessary here since PYI034 only applies to `.pyi` stub files.

🧹 Proposed fix:

```diff
-    def __new__(  # noqa: PYI034
+    def __new__(
         cls, data: Any = None, requires_grad: bool = True
     ) -> "TorchArrayParam":
```

source/tests/pt_expt/utils/test_common.py (1)
14-23: Consider adding a test for `None` input.
Per the `to_torch_array` docstring, passing `None` should return `None`. This edge case isn't covered.

Suggested addition:

```python
def test_to_torch_array_none() -> None:
    assert to_torch_array(None) is None
```

deepmd/pt_expt/descriptor/__init__.py (1)
3-3: Remove unused `noqa` directive per static analysis.
Ruff (RUF100) reports the `# noqa: F401` directive is unnecessary here since the F401 rule wouldn't trigger for this import.

Proposed fix:

```diff
-from . import se_t_tebd_block  # noqa: F401
+from . import se_t_tebd_block  # register converters
```

source/tests/pt_expt/descriptor/test_se_t_tebd.py (1)
99-100: Track or resolve the TODO about `gr` being `None`.
The comment `# TODO: investigate why gr is None` suggests an unresolved question about the descriptor's output. If `gr` is expected to be non-`None` for `se_t_tebd`, this could indicate a bug in either the pt_expt or dpmodel implementation.
Would you like me to open an issue to track this, or help investigate the `gr` return path?

source/tests/pt_expt/descriptor/test_se_t.py (2)
42-44: Consider testing `exclude_types` combinations for completeness.
Unlike `test_se_r.py` and `test_se_e2_a.py` (which iterate over `exclude_types` variations like `[[], [[0, 1]], [[1, 1]]]`), this test only varies `resnet_dt` and `precision`. The consistent test (`test_se_t.py`) does parameterize over `excluded_types`, so it may be worth exercising that path here too (especially the `reinit_exclude` codepath tested in `test_se_e2_a.py`); a sketch follows below.
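A sketch of the extra parameter axis, using a plain product loop rather than whatever decorator the test suite actually uses, so the names here are illustrative:

```python
# Hypothetical parameter grid mirroring test_se_e2_a.py's variations.
from itertools import product

RESNET_DT = (True, False)
PRECISION = ("float64", "float32")
EXCLUDE_TYPES = ([], [[0, 1]], [[1, 1]])

for resnet_dt, precision, exclude_types in product(
    RESNET_DT, PRECISION, EXCLUDE_TYPES
):
    # build the descriptor with these options and compare dpmodel vs
    # pt_expt outputs, as the existing cases already do
    pass
```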
65-65: Unused variables `gr1` and `gr2` flagged by Ruff.
Per the comment on line 90, `se_t` returns `None` for `gr`/`g2`/`h2`, so these are intentionally not compared. Prefix with `_` to suppress the lint warning:

```diff
-        rd1, gr1, _, _, sw1 = dd1(
+        rd1, _gr1, _, _, sw1 = dd1(
 ...
-        rd2, gr2, _, _, sw2 = dd2.call(
+        rd2, _gr2, _, _, sw2 = dd2.call(
```

Also applies to: 85-85
```python
    features: ClassVar[Backend.Feature] = (
        Backend.Feature.ENTRY_POINT
        | Backend.Feature.DEEP_EVAL
        | Backend.Feature.NEIGHBOR_STAT
        | Backend.Feature.IO
    )
```
Feature flags declare capabilities that raise NotImplementedError.
DEEP_EVAL, NEIGHBOR_STAT, and IO are declared as supported features, but their corresponding properties (deep_eval, neighbor_stat, serialize_hook, deserialize_hook) raise NotImplementedError. If a caller checks backend.features & Feature.DEEP_EVAL and then accesses backend.deep_eval, it'll get an unexpected error. Consider either removing the feature flags until implemented, or adding a brief comment explaining the roadmap.
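One way to apply the first option, sketched against the snippet above (which flags to keep is the authors' call):

```python
# Sketch: advertise only what works today, with a breadcrumb for the rest.
features: ClassVar[Backend.Feature] = Backend.Feature.ENTRY_POINT
# TODO(pt-expt): re-add DEEP_EVAL | NEIGHBOR_STAT | IO once deep_eval,
# neighbor_stat, serialize_hook, and deserialize_hook stop raising
# NotImplementedError.
```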
From deepmd/dpmodel/utils/network.py:

```python
    @classmethod
    def deserialize(cls, data: dict) -> "EmbeddingNet":
        """Deserialize the network from a dict.

        Parameters
        ----------
        data : dict
            The dict to deserialize from.
        """
        data = data.copy()
        check_version_compatibility(data.pop("@version", 1), 2, 1)
        data.pop("@class", None)
        layers = data.pop("layers")
        obj = cls(**data)
        # Reinitialize layers from serialized data, using the same layer type
        # that __init__ created (respects subclass overrides via MRO).
        layer_type = type(obj.layers[0])
        obj.layers = type(obj.layers)(
            [layer_type.deserialize(layer) for layer in layers]
        )
        return obj
```
deserialize assumes obj.layers is non-empty (Line 892).
layer_type = type(obj.layers[0]) will raise IndexError if neuron is an empty list. While this is unlikely in practice, the factory-based deserialize (line 784–786) didn't have this assumption since it called super(EN, obj).__init__(layers) directly.
If neuron=[] is never valid, this is fine as-is. Otherwise, a guard would prevent a confusing traceback.
Defensive fix:

```diff
     obj = cls(**data)
     # Reinitialize layers from serialized data, using the same layer type
     # that __init__ created (respects subclass overrides via MRO).
-    layer_type = type(obj.layers[0])
-    obj.layers = type(obj.layers)(
-        [layer_type.deserialize(layer) for layer in layers]
-    )
+    if obj.layers:
+        layer_type = type(obj.layers[0])
+    else:
+        layer_type = NativeLayer
+    obj.layers = type(obj.layers)(
+        [layer_type.deserialize(layer) for layer in layers]
+    )
     return obj
```
deepmd/pt_expt/utils/type_embed.py (outdated)
```python
# Import network to ensure EmbeddingNet is registered before TypeEmbedNet is used
from deepmd.pt_expt.utils import network  # noqa: F401
```
🧩 Analysis chain

🏁 Scripts executed:

```bash
# Check linter configuration
find . -maxdepth 2 \( -name "pyproject.toml" -o -name "setup.cfg" -o -name ".ruff.toml" -o -name "ruff.toml" \) -type f
# Read the actual file to understand context
cat -n deepmd/pt_expt/utils/type_embed.py
# Check if F401 is mentioned in ruff or flake8 configs
rg -i "F401|extend-select|extend-ignore" --type toml --type cfg --type ini
cat -n pyproject.toml | head -100
cat -n ruff.toml
# Check if F401 is explicitly mentioned in the configs
rg "F401" pyproject.toml ruff.toml
```
Remove the unnecessary noqa: F401 directive since F401 is globally ignored in the project's Ruff configuration.
The side-effect import on line 15 is appropriate and well-documented. However, the project's ruff.toml explicitly ignores F401 globally, making the noqa: F401 comment unnecessary. This will also prevent RUF100 warnings about unused noqa directives.
Remove # noqa: F401 from line 15.
🧰 Tools
🪛 Ruff (0.14.14): [warning] 15-15: Unused noqa directive (non-enabled: F401); remove unused noqa directive (RUF100)
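After applying the fix, line 15 keeps the side-effect import and simply drops the directive (this is the quoted line above minus the noqa):

```python
# Import network to ensure EmbeddingNet is registered before TypeEmbedNet is used
from deepmd.pt_expt.utils import network
```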
From source/tests/consistent/descriptor/test_se_r.py:

```python
if INSTALLED_PT:
    from deepmd.pt.model.descriptor.se_r import DescrptSeR as DescrptSeRPT
else:
    DescrptSeAPT = None
```
Pre-existing bug: fallback variable name mismatch.
When INSTALLED_PT is False, line 30 sets DescrptSeAPT = None but pt_class on line 134 references DescrptSeRPT. If PT is not installed, this would cause a NameError at class definition time. The same issue exists on line 38 with DescrptSeATF vs DescrptSeRTF. This predates the current PR but is worth noting.
Suggested fix (pre-existing):

```diff
 if INSTALLED_PT:
     from deepmd.pt.model.descriptor.se_r import DescrptSeR as DescrptSeRPT
 else:
-    DescrptSeAPT = None
+    DescrptSeRPT = None

 if INSTALLED_TF:
     from deepmd.tf.descriptor.se_r import DescrptSeR as DescrptSeRTF
 else:
-    DescrptSeATF = None
+    DescrptSeRTF = None
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| DescrptSeAPT = None | |
| DescrptSeRPT = None |
From source/tests/pt_expt/descriptor/test_se_t_tebd.py:

```python
type_embedding = torch.randn(
    [self.nt, dd0.tebd_dim], dtype=dtype, device=self.device
)

inputs = (
    torch.tensor(self.coord_ext, dtype=dtype, device=self.device),
    torch.tensor(self.atype_ext, dtype=int, device=self.device),
    torch.tensor(self.nlist, dtype=int, device=self.device),
    type_embedding,
)
torch.export.export(dd0, inputs)
```
🧩 Analysis chain

🏁 Scripts executed:

```bash
# Verify the forward() signature of DescrptSeTTebd
fd 'se_t_tebd.py' --type f | grep -v test | head -5
# Locate the se_t_tebd.py file (not the test)
find . -name 'se_t_tebd.py' -not -path '*/tests/*' | head -1
# List structure to understand where files are
ls -la source/pt_expt/ 2>/dev/null || ls -la source/ 2>/dev/null | head -20
# Find the test file and read around lines 149-159
find . -name 'test_se_t_tebd.py' -type f
find . -path '*pt_expt*descriptor*test*.py' -type f
wc -l ./source/tests/pt_expt/descriptor/test_se_t_tebd.py
sed -n '140,165p' ./source/tests/pt_expt/descriptor/test_se_t_tebd.py
# Get forward signature from the descriptor and check parameter handling
grep -n "def forward" deepmd/pt_expt/descriptor/se_t_tebd.py -A 10
sed -n '/def forward/,/^ def /p' deepmd/pt_expt/descriptor/se_t_tebd.py | head -50
# Find the test_exportable method and the complete inputs tuple
grep -n "def test_exportable" ./source/tests/pt_expt/descriptor/test_se_t_tebd.py -A 30
sed -n '153,162p' ./source/tests/pt_expt/descriptor/test_se_t_tebd.py
# Check usage of type_embedding in the descriptor
grep -n "type_embedding" deepmd/pt_expt/descriptor/se_t_tebd.py -C 3
```
Positional argument mismatch: type_embedding is bound to the extended_atype_embd parameter.

In test_exportable, the inputs tuple passes type_embedding as the 4th positional argument. The forward signature is:

```python
def forward(self, extended_coord, extended_atype, nlist, extended_atype_embd=None, mapping=None, type_embedding=None)
```

So the 4th positional argument binds to extended_atype_embd, not type_embedding. Since forward deletes all optional parameters (del extended_atype_embd, mapping, type_embedding), this doesn't cause a runtime error, but the test silently exports with type_embedding=None, which may not be the intended export behavior.

If the intent is to export with type_embedding, pass None placeholders for the intervening parameters or use kwargs with torch.export.export.
Proposed fix (if type_embedding should be passed):

```diff
 inputs = (
     torch.tensor(self.coord_ext, dtype=dtype, device=self.device),
     torch.tensor(self.atype_ext, dtype=int, device=self.device),
     torch.tensor(self.nlist, dtype=int, device=self.device),
+    None,  # extended_atype_embd
+    None,  # mapping
     type_embedding,
 )
```
From source/tests/pt_expt/utils/test_network.py:

```python
# Test forward pass
rng = np.random.default_rng()
x_np = rng.standard_normal((5, self.in_dim))
```
Non-deterministic RNG may cause flaky test failures.
np.random.default_rng() is called without a seed, making this test non-reproducible. Every other test in this file uses GLOBAL_SEED. If a numerical tolerance mismatch occurs, it won't be reproducible.
Proposed fix:

```diff
-        rng = np.random.default_rng()
+        rng = np.random.default_rng(GLOBAL_SEED)
```
+ rng = np.random.default_rng(GLOBAL_SEED)🤖 Prompt for AI Agents
In `@source/tests/pt_expt/utils/test_network.py` around lines 162 - 164, The test
uses an unseeded RNG (rng = np.random.default_rng()) which makes the test
non-deterministic; change the RNG initialization to use the shared GLOBAL_SEED
(e.g., rng = np.random.default_rng(GLOBAL_SEED)) so x_np is reproducible, ensure
GLOBAL_SEED is imported/available in this test module, and keep the subsequent
x_np = rng.standard_normal((5, self.in_dim)) call unchanged.
Codecov Report

❌ Patch coverage check failed for this PR.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #5206      +/-   ##
==========================================
- Coverage   81.99%   81.87%   -0.12%
==========================================
  Files         724      724
  Lines       73807    73828      +21
  Branches     3616     3615       -1
==========================================
- Hits        60519    60448      -71
- Misses      12124    12217      +93
+ Partials     1164     1163       -1
```

☔ View full report in Codecov by Sentry.
This PR should be considered after #5194, #5204, and #5205.
Summary by CodeRabbit

Release Notes

New Features
- A new pt_expt (PyTorch exportable) backend with se_t and se_t_tebd descriptor support.

Improvements
- EmbeddingNet refactored from a factory-generated class to a concrete class with explicit serialization.

Tests
- New pt_expt descriptor, network, and utility tests, plus updated cross-backend consistency tests.