Perf: accelerate feature_to_block with torch_scatter #302

AsymmetryChou · 2025-12-26T04:48:29Z

Introduce methods to pre-compute scatter indices for node and edge features, enabling efficient conversion to block matrices.
This enhancement improves performance by leveraging vectorized operations.

Summary by CodeRabbit

Release Notes

New Features
- Added utility methods for enhanced feature-to-block index mapping capabilities.
Performance
- Optimized computation through vectorized operations and device-aware preprocessing for improved efficiency.
Tests
- Added comprehensive test suite verifying consistency, performance gains, and data integrity across conversions.

coderabbitai · 2025-12-26T04:56:14Z

📝 Walkthrough

Walkthrough

The pull request refactors the feature_to_block function to use vectorized scatter operations instead of Python loops. Two new public methods are added to OrbitalMapper to precompute index mappings for node and edge features. Comprehensive tests are introduced to validate consistency, performance, and round-trip conversion against a reference implementation.

Changes

Cohort / File(s)	Summary
Vectorized feature-to-block conversion `dptb/data/interfaces/ham_to_feature.py`	Replaced per-atom and per-edge loops with scatter-based construction using precomputed index mappings; removed `anglrMId` import; added device-aware preprocessing with caching of atom symbols and index arrays; reworked node and edge block assembly using scatter operations with appropriate indexing and transposition handling.
Index mapping utilities `dptb/data/transforms.py`	Added `get_node_feature_to_block_indices()` and `get_edge_feature_to_block_indices()` methods to `OrbitalMapper` class; these precompute and cache mapping indices (src, dst, scale factors) for vectorized scatter operations on node and edge features.
Test suite and reference implementation `dptb/tests/test_block_to_feature.py`	Introduced `feature_to_block_reference()` as an unoptimized reference implementation and `TestFeatureToBlockConsistency` test class with three tests: consistency validation against reference, performance comparison, and round-trip conversion verification; includes test scaffolding for dataset and Hamiltonian setup.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely summarizes the main objective: accelerating the feature_to_block function using torch_scatter vectorized operations.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

dptb/data/interfaces/ham_to_feature.py (1)
420-437: Critical bug in self-loop block storage: wrong index logic in atom_i == atom_j case.

Lines 426-431 have inverted logic. When processing a self-loop edge (i, i, R_shift), the code checks if the reverse key (i, i, -R_shift) exists in blocks, then conditionally stores or accumulates—but always at the wrong index:

If r_index doesn't exist: stores block at blocks[block_index] ✓

If r_index exists: stores at blocks[r_index] instead of blocks[block_index] ✗

This means the block for (i, i, R_shift) is never properly stored; it either overwrites or gets incorrectly accumulated into the opposite R_shift entry. For Hermitian systems with periodic boundaries, self-loop blocks with different shifts corrupt each other.

Fix: Treat self-loops like other edges—always store/accumulate at the current edge's block_index:
elif atom_i == atom_j:
    if blocks.get(block_index, None) is None:
        blocks[block_index] = block
    else:
        blocks[block_index] += block

🧹 Nitpick comments (4)

dptb/data/transforms.py (2)
848-905: Consider specifying device for precomputed index tensors.

The precomputed index tensors are created on CPU by default. When used in feature_to_block, they're moved to the target device on every call (lines 373-377 in ham_to_feature.py). For better performance, consider either:

Creating these tensors on self.device during precomputation, or

Caching the device-specific versions after first use
Option 1: Create on self.device during precomputation
         self._node_feature_to_block_indices[symbol] = {
-            'src': torch.tensor(src_indices, dtype=torch.long),
-            'dst': torch.tensor(dst_indices, dtype=torch.long),
-            'dst_T': torch.tensor(dst_indices_T, dtype=torch.long),
-            'is_diag': torch.tensor(is_diag, dtype=torch.bool),
+            'src': torch.tensor(src_indices, dtype=torch.long, device=self.device),
+            'dst': torch.tensor(dst_indices, dtype=torch.long, device=self.device),
+            'dst_T': torch.tensor(dst_indices_T, dtype=torch.long, device=self.device),
+            'is_diag': torch.tensor(is_diag, dtype=torch.bool, device=self.device),
             'norb': norb
         }
907-969: Consider specifying device for precomputed index tensors.

Similar to the node indices, the edge index tensors are created on CPU by default and moved to device on every call. Consider creating them on self.device during precomputation for better performance.
Proposed fix
         self._edge_feature_to_block_indices[bond_type] = {
-            'src': torch.tensor(src_indices, dtype=torch.long),
-            'dst': torch.tensor(dst_indices, dtype=torch.long),
-            'scale': torch.tensor(scale_factors, dtype=torch.float32),
+            'src': torch.tensor(src_indices, dtype=torch.long, device=self.device),
+            'dst': torch.tensor(dst_indices, dtype=torch.long, device=self.device),
+            'scale': torch.tensor(scale_factors, dtype=torch.float32, device=self.device),
             'norb_i': norb_i,
             'norb_j': norb_j
         }
dptb/data/interfaces/ham_to_feature.py (2)
362-365: Optimize atom symbol lookup to avoid repeated single-element untransform calls.

The current implementation calls idp.untransform() for each atom individually within a list comprehension. Since untransform supports batch operations, you can compute all symbols in one call and then convert to chemical symbols.
Proposed optimization
     # Pre-compute atom symbols for all atoms (vectorized lookup)
     atom_types = data[_keys.ATOM_TYPE_KEY]
+    atomic_numbers = idp.untransform(atom_types)
     atom_symbols = [
-        ase.data.chemical_symbols[idp.untransform(atom_types[i].reshape(-1))]
+        ase.data.chemical_symbols[int(atomic_numbers[i])]
         for i in range(len(atom_types))
     ]
373-377: Index device movement happens on every call.

The precomputed indices are moved to the target device on every feature_to_block call for each symbol/bond type. While the comment at line 372 says "only on first use per symbol," the implementation doesn't cache device-specific versions. This could impact performance when processing multiple batches.

Consider implementing device-specific caching or creating indices on the target device during precomputation (as suggested in the transforms.py review).

Also applies to: 408-412

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63b9ccd and 904a27d.

📒 Files selected for processing (2)

dptb/data/interfaces/ham_to_feature.py
dptb/data/transforms.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (5)

dptb/data/transforms.py (2)

845-846: LGTM!

The blank line improves readability by visually separating method definitions.

947-959: Clarify the scale factor logic and comment inconsistency.

The comment on line 929 states "0.5 for diagonal pairs," but line 947 checks is_same_basis (whether the basis pair is identical), not whether an element is on the matrix diagonal. These are not equivalent. Additionally, the edge version uses scale factors to handle symmetry, whereas the node version uses explicit transposed indices—the rationale for this design choice and the specific 0.5 factor for same basis pairs should be documented.

dptb/data/interfaces/ham_to_feature.py (3)

9-9: LGTM!

Correctly removed unused anglrMId import after refactoring to scatter-based operations. The angular momentum handling is now encapsulated in the precomputed index methods.

323-335: LGTM!

Clear and comprehensive docstring that explains the vectorized approach and performance benefits.

391-391: No issues found. The block_index format is consistent with the auto-detection logic in block_to_feature. Line 391 writes 0-indexed blocks ([atom, atom, 0, 0, 0]), and the start_id logic at lines 50–58 correctly detects this format and retrieves blocks with matching indices.

Copilot

Pull request overview

This PR optimizes the feature_to_block function by introducing pre-computed scatter indices for vectorized operations. The optimization replaces nested Python loops with efficient PyTorch scatter operations, improving performance when converting feature vectors to Hamiltonian/overlap block matrices.

Key changes:

Added two new caching methods (get_node_feature_to_block_indices and get_edge_feature_to_block_indices) to pre-compute scatter index mappings
Refactored feature_to_block to use vectorized scatter operations instead of nested loops
Removed unused anglrMId import from ham_to_feature.py

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
dptb/data/transforms.py	Added `get_node_feature_to_block_indices` and `get_edge_feature_to_block_indices` methods to pre-compute and cache scatter indices for vectorized block matrix construction
dptb/data/interfaces/ham_to_feature.py	Refactored `feature_to_block` function to use pre-computed scatter indices with vectorized operations; removed unused import; added comprehensive docstring

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

dptb/data/interfaces/ham_to_feature.py

dptb/data/transforms.py

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@dptb/data/interfaces/ham_to_feature.py`:
- Around line 360-365: The loop building atom_symbols is indexing
ase.data.chemical_symbols with a tensor returned from idp.untransform(), causing
a TypeError; fix it by converting the tensor to a Python integer scalar before
using it as an index (e.g., call .item() or int(...) on the result of
idp.untransform(atom_types[i].reshape(-1))) so ase.data.chemical_symbols[...]
yields a Python string, and ensure the list atom_symbols contains those hashable
strings (so set(atom_symbols) will work).

coderabbitai · 2026-01-24T08:36:53Z

dptb/data/interfaces/ham_to_feature.py

+        # Pre-compute atom symbols for all atoms (vectorized lookup)
+        atom_types = data[_keys.ATOM_TYPE_KEY]
+        atom_symbols = [
+            ase.data.chemical_symbols[idp.untransform(atom_types[i].reshape(-1))]
+            for i in range(len(atom_types))
+        ]


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, find the untransform function definition rg -n "def untransform" -A 5 dptb/data/

Repository: deepmodeling/DeePTB

Length of output: 1445

🏁 Script executed:

# Also check imports in ham_to_feature.py to understand idp rg -n "import.*idp\|from.*idp" dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 45

🏁 Script executed:

# Check the actual context around lines 360-365 sed -n '350,375p' dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 1072

🏁 Script executed:

# Search for ATOM_TYPE_KEY definition and its type rg -n "ATOM_TYPE_KEY" dptb/data/

Repository: deepmodeling/DeePTB

Length of output: 2262

🏁 Script executed:

# Search for other uses of untransform and how the result is used rg -n "untransform\(" -B2 -A2 dptb/data/interfaces/ham_to_feature.py | head -50

Repository: deepmodeling/DeePTB

Length of output: 1202

🏁 Script executed:

# Check if .item() is used elsewhere in the file rg -n "\.item\(\)" dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 45

🏁 Script executed:

# Look for ase.data.chemical_symbols usage patterns rg -n "ase\.data\.chemical_symbols" dptb/

Repository: deepmodeling/DeePTB

Length of output: 1306

🏁 Script executed:

# Check what type atom_types actually is and how it flows sed -n '355,370p' dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 634

🏁 Script executed:

# Check how atomic_numbers is typically used when obtained from untransform sed -n '75,85p' dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 565

🏁 Script executed:

# Look at the broader context - is ATOMIC_NUMBERS_KEY available at line 360? sed -n '320,370p' dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 2079

🏁 Script executed:

# Check if there are any tests for this code path find dptb -name "*test*.py" -type f | xargs rg -l "ham_to_feature\|feature_to_block" 2>/dev/null

Repository: deepmodeling/DeePTB

Length of output: 47

🏁 Script executed:

# Verify the exact type of untransform return value and how .reshape(-1) behaves rg -n "def untransform" -A 10 dptb/data/transforms.py | head -30

Repository: deepmodeling/DeePTB

Length of output: 1326

🏁 Script executed:

# Check if the code at line 363 is actually executed - look for test patterns rg -n "feature_to_block" dptb/ -A 3 | grep -E "test|def|feature_to_block"

Repository: deepmodeling/DeePTB

Length of output: 4322

🏁 Script executed:

# Try to understand what atom_types[i].reshape(-1) produces and if it can be indexed python3 << 'EOF' import torch # Simulate what happens atom_types = torch.tensor([[1], [2], [3]]) # Example shape result = atom_types[0].reshape(-1) # This would be a 1D tensor of shape [1] print(f"Result shape: {result.shape}") print(f"Result type: {type(result)}") print(f"Result dtype: {result.dtype}") # Try to use this as an index into a tuple try: test_tuple = ("H", "C", "N", "O") index = result val = test_tuple[index] print(f"Indexing worked: {val}") except Exception as e: print(f"Error when indexing tuple with tensor: {type(e).__name__}: {e}") # Try with .item() try: val = test_tuple[result.item()] print(f"Indexing with .item() worked: {val}") except Exception as e: print(f"Error with .item(): {e}") EOF

Repository: deepmodeling/DeePTB

Length of output: 149

🏁 Script executed:

# Check how atom_types are structured when passed to feature_to_block rg -n "feature_to_block\(" dptb/ -B 5 | head -40

Repository: deepmodeling/DeePTB

Length of output: 2692

🏁 Script executed:

# Check the test more carefully to see if it actually hits the atom_symbols code sed -n '80,100p' dptb/tests/test_block_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 1189

🏁 Script executed:

# Let me verify what happens when you try to index with a tensor python3 << 'EOF' import ase.data # Simulate the exact operation from the code test_index = __import__('torch').tensor([6]) # Carbon has atomic number 6 try: result = ase.data.chemical_symbols[test_index] print(f"Success: {result}") except TypeError as e: print(f"TypeError: {e}") except Exception as e: print(f"{type(e).__name__}: {e}") # Try with .item() try: result = ase.data.chemical_symbols[test_index.item()] print(f"With .item() success: {result}") except Exception as e: print(f"With .item() error: {e}") EOF

Repository: deepmodeling/DeePTB

Length of output: 178

🏁 Script executed:

# Check if there were recent changes to this function git log --oneline -n 20 -- dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 116

🏁 Script executed:

# Check the recent commit to understand what changed git show d97a847 --stat

Repository: deepmodeling/DeePTB

Length of output: 50375

🏁 Script executed:

# Get more details about the specific change git show d97a847 | head -100

Repository: deepmodeling/DeePTB

Length of output: 3455

🏁 Script executed:

# Look at the actual error - try to understand the code path better rg -n "atom_symbols\[" dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 202

🏁 Script executed:

# Check if set(atom_symbols) is actually called and what could fail sed -n '360,380p' dptb/data/interfaces/ham_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 947

🏁 Script executed:

# Check if the test actually covers atom_symbols code path - look at test data sed -n '85,105p' dptb/tests/test_block_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 1395

🏁 Script executed:

# Verify if ATOMIC_NUMBERS_KEY is set before feature_to_block is called sed -n '30,50p' dptb/tests/test_block_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 541

🏁 Script executed:

# Check the actual structure to understand data flow before feature_to_block sed -n '1,50p' dptb/tests/test_block_to_feature.py

Repository: deepmodeling/DeePTB

Length of output: 1480

🏁 Script executed:

# Try to simulate what happens when indexing with a tensor python3 << 'EOF' # Verify that tensor indexing on tuple fails try: import torch test_tuple = ("H", "C", "N", "O") tensor_idx = torch.tensor([2]) result = test_tuple[tensor_idx] print(f"Success (unexpected): {result}") except TypeError as e: print(f"TypeError (expected): {e}") except Exception as e: print(f"Other error: {type(e).__name__}: {e}") # Test with .item() try: result = test_tuple[tensor_idx.item()] print(f"With .item() success: {result}") except Exception as e: print(f"With .item() error: {e}") EOF

Repository: deepmodeling/DeePTB

Length of output: 173

🏁 Script executed:

# Check if ATOMIC_NUMBERS_KEY is being used anywhere in feature_to_block or its callsites sed -n '322,450p' dptb/data/interfaces/ham_to_feature.py | grep -n "ATOMIC_NUMBERS\|atom_symbols"

Repository: deepmodeling/DeePTB

Length of output: 275

Fix tensor indexing when deriving atom symbols from idp.untransform().

idp.untransform() returns a tensor, which cannot index the ase.data.chemical_symbols tuple. This will cause a TypeError at runtime. Convert the tensor to an integer scalar before indexing, and ensure set(atom_symbols) receives hashable values.

♻️ Suggested fix

atom_types = data[_keys.ATOM_TYPE_KEY] atom_symbols = [ - ase.data.chemical_symbols[idp.untransform(atom_types[i].reshape(-1))] + ase.data.chemical_symbols[int(idp.untransform(atom_types[i]).item())] for i in range(len(atom_types)) ]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Pre-compute atom symbols for all atoms (vectorized lookup)

atom_types = data[_keys.ATOM_TYPE_KEY]

atom_symbols = [

ase.data.chemical_symbols[idp.untransform(atom_types[i].reshape(-1))]

for i in range(len(atom_types))

]

# Pre-compute atom symbols for all atoms (vectorized lookup)

atom_types = data[_keys.ATOM_TYPE_KEY]

atom_symbols = [

ase.data.chemical_symbols[int(idp.untransform(atom_types[i]).item())]

for i in range(len(atom_types))

]

🤖 Prompt for AI Agents

In `@dptb/data/interfaces/ham_to_feature.py` around lines 360 - 365, The loop building atom_symbols is indexing ase.data.chemical_symbols with a tensor returned from idp.untransform(), causing a TypeError; fix it by converting the tensor to a Python integer scalar before using it as an index (e.g., call .item() or int(...) on the result of idp.untransform(atom_types[i].reshape(-1))) so ase.data.chemical_symbols[...] yields a Python string, and ensure the list atom_symbols contains those hashable strings (so set(atom_symbols) will work).

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-03T07:35:34Z

dptb/tests/test_feature_to_block_consistency.py

+operations and added device-side caching to avoid redundant transfers.
+"""
+
+import pytest


Import of 'pytest' is not used.

Suggested change

import pytest

Copilot · 2026-02-03T07:35:34Z

dptb/tests/test_feature_to_block_consistency.py

+from dptb.nn.nnsk import NNSK
+from dptb.data.transforms import OrbitalMapper
+from dptb.data.build import build_dataset
+from dptb.data import AtomicDataset, DataLoader, AtomicDataDict, AtomicData


Import of 'AtomicDataset' is not used.
Import of 'AtomicDataDict' is not used.

Suggested change

from dptb.data import AtomicDataset, DataLoader, AtomicDataDict, AtomicData

from dptb.data import DataLoader, AtomicData

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@dptb/tests/test_feature_to_block_consistency.py`:
- Around line 230-270: The test_performance_improvement benchmark is flaky:
switch timing from time.time() to time.perf_counter() for higher-resolution
clocks when measuring optimized_time and reference_time; gate the test with a
pytest skip/marker that checks an env var (e.g., skip unless
os.environ.get("RUN_PERF_TESTS") == "1") so it only runs in perf-aware CI or
locally; and change the unnecessary f-string print(f"\nPerformance comparison:")
to a normal print("\nPerformance comparison:"). Update references to the
measured functions feature_to_block and feature_to_block_reference and the
test_performance_improvement function accordingly.
- Around line 49-64: The non-overlap branch currently only sets blocks when
NODE_FEATURES_KEY exists and silently continues otherwise, risking an
UnboundLocalError; make it symmetric with the overlap branch by explicitly
checking for both _keys.NODE_FEATURES_KEY and _keys.EDGE_FEATURES_KEY at the
start of the overlap=False path, and if either is missing raise a KeyError
(similar to the overlap=True path that checks
_keys.NODE_OVERLAP_KEY/_keys.EDGE_OVERLAP_KEY); when both exist, assign
node_features, edge_features, set has_block = True and initialize blocks = {}.

🧹 Nitpick comments (1)

dptb/tests/test_feature_to_block_consistency.py (1)
147-194: Defer dataset construction to test setup to avoid import-time work and shared state.
Building datasets/loaders at class definition runs during collection (even when tests are deselected) and keeps mutable config dicts shared across tests. Moving this into setup_class or a fixture reduces collection cost and avoids accidental mutation leaks; consider ClassVar annotations if you keep class-level dicts.
♻️ Suggested refactor using setup_class
-    # Build test dataset
-    train_datasets = build_dataset(**data_options, **data_options["train"], **common_options)
-    train_loader = DataLoader(dataset=train_datasets, batch_size=1, shuffle=False)
-    batch = next(iter(train_loader))
-    batch = AtomicData.to_AtomicDataDict(batch)
-
-    # Create OrbitalMapper instances
-    idp = OrbitalMapper(basis=common_options['basis'], method="sktb")
+    `@classmethod`
+    def setup_class(cls):
+        cls.train_datasets = build_dataset(**cls.data_options, **cls.data_options["train"], **cls.common_options)
+        cls.train_loader = DataLoader(dataset=cls.train_datasets, batch_size=1, shuffle=False)
+        cls.batch = AtomicData.to_AtomicDataDict(next(iter(cls.train_loader)))
+        cls.idp = OrbitalMapper(basis=cls.common_options["basis"], method="sktb")

coderabbitai · 2026-02-03T07:38:34Z

dptb/tests/test_feature_to_block_consistency.py

+    has_block = False
+    if not overlap:
+        if data.get(_keys.NODE_FEATURES_KEY, None) is not None:
+            node_features = data[_keys.NODE_FEATURES_KEY]
+            edge_features = data[_keys.EDGE_FEATURES_KEY]
+            has_block = True
+            blocks = {}
+    else:
+        if data.get(_keys.NODE_OVERLAP_KEY, None) is not None:
+            node_features = data[_keys.NODE_OVERLAP_KEY]
+            edge_features = data[_keys.EDGE_OVERLAP_KEY]
+            has_block = True
+            blocks = {}
+        else:
+            raise KeyError("Overlap features not found in data.")
+


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n dptb/tests/test_feature_to_block_consistency.py | head -100

Repository: deepmodeling/DeePTB

Length of output: 5121

🏁 Script executed:

wc -l dptb/tests/test_feature_to_block_consistency.py

Repository: deepmodeling/DeePTB

Length of output: 114

🏁 Script executed:

cat -n dptb/tests/test_feature_to_block_consistency.py | sed -n '49,150p'

Repository: deepmodeling/DeePTB

Length of output: 5712

🏁 Script executed:

cat -n dptb/tests/test_feature_to_block_consistency.py | sed -n '140,200p'

Repository: deepmodeling/DeePTB

Length of output: 2142

🏁 Script executed:

cat -n dptb/tests/test_feature_to_block_consistency.py | sed -n '195,250p'

Repository: deepmodeling/DeePTB

Length of output: 2855

🏁 Script executed:

cat -n dptb/tests/test_feature_to_block_consistency.py | sed -n '250,298p'

Repository: deepmodeling/DeePTB

Length of output: 2802

Make error handling symmetric for non-overlap and overlap feature paths.

The overlap=False path only initializes blocks when NODE_FEATURES_KEY exists, but silent skipping causes UnboundLocalError at the return statement if features are missing. The overlap=True path explicitly raises KeyError for missing features (line 63). Restructure the non-overlap path to check both features upfront and raise KeyError explicitly, matching the overlap path behavior and providing clearer failure modes.

Proposed fix

has_block = False if not overlap: - if data.get(_keys.NODE_FEATURES_KEY, None) is not None: - node_features = data[_keys.NODE_FEATURES_KEY] - edge_features = data[_keys.EDGE_FEATURES_KEY] - has_block = True - blocks = {} + node_features = data.get(_keys.NODE_FEATURES_KEY) + edge_features = data.get(_keys.EDGE_FEATURES_KEY) + if node_features is None or edge_features is None: + missing = _keys.NODE_FEATURES_KEY if node_features is None else _keys.EDGE_FEATURES_KEY + raise KeyError(missing) + has_block = True + blocks = {}

🧰 Tools

🪛 Ruff (0.14.14)

[warning] 63-63: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In `@dptb/tests/test_feature_to_block_consistency.py` around lines 49 - 64, The non-overlap branch currently only sets blocks when NODE_FEATURES_KEY exists and silently continues otherwise, risking an UnboundLocalError; make it symmetric with the overlap branch by explicitly checking for both _keys.NODE_FEATURES_KEY and _keys.EDGE_FEATURES_KEY at the start of the overlap=False path, and if either is missing raise a KeyError (similar to the overlap=True path that checks _keys.NODE_OVERLAP_KEY/_keys.EDGE_OVERLAP_KEY); when both exist, assign node_features, edge_features, set has_block = True and initialize blocks = {}.

coderabbitai · 2026-02-03T07:38:35Z

dptb/tests/test_feature_to_block_consistency.py

+    def test_performance_improvement(self):
+        """
+        Benchmark test to verify that optimized implementation is faster
+        than the reference implementation.
+        """
+        # Create NNSK model and generate features
+        hamiltonian = SKHamiltonian(idp_sk=self.idp, onsite=True)
+        nnsk = NNSK(**self.common_options, **self.model_options["nnsk"], transform=False)
+
+        with torch.no_grad():
+            data = nnsk(self.batch.copy())
+            data = hamiltonian(data)
+
+            # Warm-up runs
+            for _ in range(3):
+                _ = feature_to_block(data, nnsk.idp, overlap=False)
+                _ = feature_to_block_reference(data, nnsk.idp, overlap=False)
+
+            # Benchmark optimized implementation
+            num_runs = 10
+            start_time = time.time()
+            for _ in range(num_runs):
+                _ = feature_to_block(data, nnsk.idp, overlap=False)
+            optimized_time = (time.time() - start_time) / num_runs
+
+            # Benchmark reference implementation
+            start_time = time.time()
+            for _ in range(num_runs):
+                _ = feature_to_block_reference(data, nnsk.idp, overlap=False)
+            reference_time = (time.time() - start_time) / num_runs
+
+        speedup = reference_time / optimized_time
+        print(f"\nPerformance comparison:")
+        print(f"  Optimized: {optimized_time*1000:.3f} ms")
+        print(f"  Reference: {reference_time*1000:.3f} ms")
+        print(f"  Speedup: {speedup:.2f}x")
+
+        # Assert that optimized version is faster (or at least not significantly slower)
+        # Allow for some variance in timing measurements
+        assert optimized_time <= reference_time * 1.1, \
+            f"Optimized implementation is slower: {optimized_time:.4f}s vs {reference_time:.4f}s"


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and get its line count fd -t f "test_feature_to_block_consistency.py" dptb/tests/ wc -l dptb/tests/test_feature_to_block_consistency.py 2>/dev/null || echo "File not found at expected path"

Repository: deepmodeling/DeePTB

Length of output: 162

🏁 Script executed:

# Read the specific lines mentioned in the review (230-270) sed -n '230,270p' dptb/tests/test_feature_to_block_consistency.py

Repository: deepmodeling/DeePTB

Length of output: 1885

🏁 Script executed:

# Check imports at the top of the file to see what's available head -30 dptb/tests/test_feature_to_block_consistency.py

Repository: deepmodeling/DeePTB

Length of output: 1013

🏁 Script executed:

# Check if pytest and os are imported rg "^import (time|pytest|os)" dptb/tests/test_feature_to_block_consistency.py

Repository: deepmodeling/DeePTB

Length of output: 98

Performance test assertion is likely to be flaky in CI.

Wall-clock timing with a tight 1.1× bound is noisy under variable system load. Use time.perf_counter() for benchmarking instead of time.time(), gate the test behind a pytest marker (e.g., skip unless RUN_PERF_TESTS=1), and remove the unused f-string prefix on the static print statement.

Stabilization changes

+ `@pytest.mark.skipif`(os.getenv("RUN_PERF_TESTS") != "1", reason="Performance checks are noisy; run in perf CI job.") def test_performance_improvement(self): """ Benchmark test to verify that optimized implementation is faster than the reference implementation. """ # Create NNSK model and generate features hamiltonian = SKHamiltonian(idp_sk=self.idp, onsite=True) nnsk = NNSK(**self.common_options, **self.model_options["nnsk"], transform=False) with torch.no_grad(): data = nnsk(self.batch.copy()) data = hamiltonian(data) # Warm-up runs for _ in range(3): _ = feature_to_block(data, nnsk.idp, overlap=False) _ = feature_to_block_reference(data, nnsk.idp, overlap=False) # Benchmark optimized implementation num_runs = 10 - start_time = time.time() + start_time = time.perf_counter() for _ in range(num_runs): _ = feature_to_block(data, nnsk.idp, overlap=False) - optimized_time = (time.time() - start_time) / num_runs + optimized_time = (time.perf_counter() - start_time) / num_runs # Benchmark reference implementation - start_time = time.time() + start_time = time.perf_counter() for _ in range(num_runs): _ = feature_to_block_reference(data, nnsk.idp, overlap=False) - reference_time = (time.time() - start_time) / num_runs + reference_time = (time.perf_counter() - start_time) / num_runs speedup = reference_time / optimized_time - print(f"\nPerformance comparison:") + print("\nPerformance comparison:")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def test_performance_improvement(self):

"""

Benchmark test to verify that optimized implementation is faster

than the reference implementation.

"""

# Create NNSK model and generate features

hamiltonian = SKHamiltonian(idp_sk=self.idp, onsite=True)

nnsk = NNSK(**self.common_options, **self.model_options["nnsk"], transform=False)

with torch.no_grad():

data = nnsk(self.batch.copy())

data = hamiltonian(data)

# Warm-up runs

for _ in range(3):

_ = feature_to_block(data, nnsk.idp, overlap=False)

_ = feature_to_block_reference(data, nnsk.idp, overlap=False)

# Benchmark optimized implementation

num_runs = 10

start_time = time.time()

for _ in range(num_runs):

_ = feature_to_block(data, nnsk.idp, overlap=False)

optimized_time = (time.time() - start_time) / num_runs

# Benchmark reference implementation

start_time = time.time()

for _ in range(num_runs):

_ = feature_to_block_reference(data, nnsk.idp, overlap=False)

reference_time = (time.time() - start_time) / num_runs

speedup = reference_time / optimized_time

print(f"\nPerformance comparison:")

print(f" Optimized: {optimized_time*1000:.3f} ms")

print(f" Reference: {reference_time*1000:.3f} ms")

print(f" Speedup: {speedup:.2f}x")

# Assert that optimized version is faster (or at least not significantly slower)

# Allow for some variance in timing measurements

assert optimized_time <= reference_time * 1.1, \

f"Optimized implementation is slower: {optimized_time:.4f}s vs {reference_time:.4f}s"

`@pytest.mark.skipif`(os.getenv("RUN_PERF_TESTS") != "1", reason="Performance checks are noisy; run in perf CI job.")

def test_performance_improvement(self):

"""

Benchmark test to verify that optimized implementation is faster

than the reference implementation.

"""

# Create NNSK model and generate features

hamiltonian = SKHamiltonian(idp_sk=self.idp, onsite=True)

nnsk = NNSK(**self.common_options, **self.model_options["nnsk"], transform=False)

with torch.no_grad():

data = nnsk(self.batch.copy())

data = hamiltonian(data)

# Warm-up runs

for _ in range(3):

_ = feature_to_block(data, nnsk.idp, overlap=False)

_ = feature_to_block_reference(data, nnsk.idp, overlap=False)

# Benchmark optimized implementation

num_runs = 10

start_time = time.perf_counter()

for _ in range(num_runs):

_ = feature_to_block(data, nnsk.idp, overlap=False)

optimized_time = (time.perf_counter() - start_time) / num_runs

# Benchmark reference implementation

start_time = time.perf_counter()

for _ in range(num_runs):

_ = feature_to_block_reference(data, nnsk.idp, overlap=False)

reference_time = (time.perf_counter() - start_time) / num_runs

speedup = reference_time / optimized_time

print("\nPerformance comparison:")

print(f" Optimized: {optimized_time*1000:.3f} ms")

print(f" Reference: {reference_time*1000:.3f} ms")

print(f" Speedup: {speedup:.2f}x")

# Assert that optimized version is faster (or at least not significantly slower)

# Allow for some variance in timing measurements

assert optimized_time <= reference_time * 1.1, \

f"Optimized implementation is slower: {optimized_time:.4f}s vs {reference_time:.4f}s"

🧰 Tools

🪛 Ruff (0.14.14)

[error] 262-262: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents

In `@dptb/tests/test_feature_to_block_consistency.py` around lines 230 - 270, The test_performance_improvement benchmark is flaky: switch timing from time.time() to time.perf_counter() for higher-resolution clocks when measuring optimized_time and reference_time; gate the test with a pytest skip/marker that checks an env var (e.g., skip unless os.environ.get("RUN_PERF_TESTS") == "1") so it only runs in perf-aware CI or locally; and change the unnecessary f-string print(f"\nPerformance comparison:") to a normal print("\nPerformance comparison:"). Update references to the measured functions feature_to_block and feature_to_block_reference and the test_performance_improvement function accordingly.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@dptb/tests/test_block_to_feature.py`:
- Line 448: The print statement uses an unnecessary f-string: replace the call
to print(f"\nPerformance comparison:") with a normal string print("\nPerformance
comparison:") to remove the unused f-string prefix (locate the print in
dptb/tests/test_block_to_feature.py where the output line "Performance
comparison:" is printed).

🧹 Nitpick comments (2)

dptb/tests/test_block_to_feature.py (2)
335-371: Consider annotating mutable class attributes with ClassVar.

The static analyzer flagged that common_options, model_options, and data_options should be annotated with typing.ClassVar since they are mutable class attributes. However, this pattern is consistent with the existing TestBlock2Feature class above, so this is a minor nitpick.

436-456: Consider using time.perf_counter() and watch for CI flakiness.

time.perf_counter() provides higher resolution timing than time.time() and is recommended for benchmarking.

The 10% tolerance assertion (optimized_time <= reference_time * 1.1) may be flaky in CI environments where system load varies. Consider either:

Marking this test with @pytest.mark.slow or similar to skip in CI

Removing the assertion and keeping only the informational print

Increasing the tolerance
♻️ Suggested change for perf_counter
-            start_time = time.time()
+            start_time = time.perf_counter()
             for _ in range(num_runs):
                 _ = feature_to_block(data, nnsk.idp, overlap=False)
-            optimized_time = (time.time() - start_time) / num_runs
+            optimized_time = (time.perf_counter() - start_time) / num_runs

             # Benchmark reference implementation
-            start_time = time.time()
+            start_time = time.perf_counter()
             for _ in range(num_runs):
                 _ = feature_to_block_reference(data, nnsk.idp, overlap=False)
-            reference_time = (time.time() - start_time) / num_runs
+            reference_time = (time.perf_counter() - start_time) / num_runs

coderabbitai · 2026-02-03T07:51:10Z

dptb/tests/test_block_to_feature.py

+            reference_time = (time.time() - start_time) / num_runs
+
+        speedup = reference_time / optimized_time
+        print(f"\nPerformance comparison:")


⚠️ Potential issue | 🟡 Minor

Remove unnecessary f-string prefix.

This f-string has no placeholders and should be a regular string.

🔧 Proposed fix

- print(f"\nPerformance comparison:") + print("\nPerformance comparison:")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

print(f"\nPerformance comparison:")

print("\nPerformance comparison:")

🧰 Tools

🪛 Ruff (0.14.14)

[error] 448-448: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents

In `@dptb/tests/test_block_to_feature.py` at line 448, The print statement uses an unnecessary f-string: replace the call to print(f"\nPerformance comparison:") with a normal string print("\nPerformance comparison:") to remove the unused f-string prefix (locate the print in dptb/tests/test_block_to_feature.py where the output line "Performance comparison:" is printed).

Perf: accelerate feature_to_block with torch_scatter

904a27d

coderabbitai bot reviewed Dec 26, 2025

View reviewed changes

AsymmetryChou requested review from Copilot and floatingCatty December 26, 2025 04:58

Copilot started reviewing on behalf of AsymmetryChou December 26, 2025 06:19 View session

Copilot AI reviewed Dec 26, 2025

View reviewed changes

Feat: 优化 feature_to_block 函数，缓存设备传输的索引以减少冗余传输

d97a847

coderabbitai bot reviewed Jan 24, 2026

View reviewed changes

AsymmetryChou added 2 commits February 2, 2026 22:14

Merge branch 'main' into perf_feat2block

7d57851

feat(tests): 添加 feature_to_block 优化的一致性测试模块

2379700

AsymmetryChou requested a review from Copilot February 3, 2026 07:32

Copilot started reviewing on behalf of AsymmetryChou February 3, 2026 07:32 View session

Copilot AI reviewed Feb 3, 2026

View reviewed changes

coderabbitai bot reviewed Feb 3, 2026

View reviewed changes

feat(tests): mv feature_to_block tests to test_block_to_feature.py

826325c

coderabbitai bot reviewed Feb 3, 2026

View reviewed changes

floatingCatty approved these changes Feb 4, 2026

View reviewed changes

floatingCatty merged commit 00973af into deepmodeling:main Feb 4, 2026
4 checks passed

	from dptb.data import AtomicDataset, DataLoader, AtomicDataDict, AtomicData
	from dptb.data import DataLoader, AtomicData

	print(f"\nPerformance comparison:")
	print("\nPerformance comparison:")

Perf: accelerate feature_to_block with torch_scatter #302

Perf: accelerate feature_to_block with torch_scatter #302

Uh oh!

Conversation

AsymmetryChou commented Dec 26, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Dec 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AsymmetryChou commented Dec 26, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 26, 2025 •

edited

Loading