Skip to content

Conversation

@brice-stacks
Copy link
Contributor

Improves the UX by merging validate-naka-block and validate-block, improves the lookup time by optimizing the queries, and improves the output for a better experience. Adds --early-exit flag to exit on first error instead of completing all blocks.

Improves the UX by merging `validate-naka-block` and `validate-block,`
improves the lookup time by optimizing the queries, and improves the
output for a better experience. Adds `--early-exit` flag to exit on
first error instead of completing all blocks.
@codecov
Copy link

codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 1.70213% with 231 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.95%. Comparing base (7ff75a2) to head (48db5d0).
⚠️ Report is 52 commits behind head on develop.

Files with missing lines Patch % Lines
contrib/stacks-inspect/src/lib.rs 0.00% 230 Missing ⚠️
stackslib/src/chainstate/stacks/db/blocks.rs 80.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6735      +/-   ##
===========================================
+ Coverage    70.05%   71.95%   +1.89%     
===========================================
  Files          578      579       +1     
  Lines       358373   360160    +1787     
===========================================
+ Hits        251064   259140    +8076     
+ Misses      107309   101020    -6289     
Files with missing lines Coverage Δ
contrib/stacks-inspect/src/main.rs 0.00% <ø> (ø)
stackslib/src/chainstate/stacks/db/blocks.rs 82.66% <80.00%> (+10.09%) ⬆️
contrib/stacks-inspect/src/lib.rs 4.51% <0.00%> (-15.37%) ⬇️

... and 390 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7ff75a2...48db5d0. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

panic!("Failed to open staging blocks DB at {staging_blocks_db_path}: {e}");
});
let sql = format!(
"SELECT index_block_hash, consensus_hash, anchored_block_hash, height FROM staging_blocks {clause}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since we are only consuming index_block_hash, maybe we can simplity the query as SELECT index_block_hash FROM staging_blocks {clause}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had that there originally because I needed other fields when I was also reusing the new structure in other places, but then ended up deleting that code. I'll delete these now.

for (i, index_block_hash) in index_block_hashes.iter().enumerate() {
if i % 100 == 0 {
println!("Checked {i}...");
let sql = format!("SELECT index_block_hash, height FROM nakamoto_staging_blocks {clause}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since we are only consuming index_block_hash, maybe we can simplity the query as SELECT index_block_hash FROM nakamoto_staging_blocks {clause}

Comment on lines 232 to 233
let mut seen = HashSet::new();
let mut entries = Vec::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suuuuper nit: These two variables hold over 5 million index_block_hash values each when the function is called with BlockSelection::All, stored as 64 bytes hex strings. ~320 MB for each variable (plus additional overhead from the data structures). It's not a concern right now, but we could reduce memory usage by about half if we move the

let index_block_hash = StacksBlockId::from_hex(&index_block_hash_hex).unwrap();

processing inside this function and avoid keeping the hex strings around.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as always, feel free to skip the changes for the nits

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. Also, a good reason to just remove the seen tracking altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Remove the `seen` tracking since it doesn't really gain us much and uses
a lot of space when processing many blocks.
next_staging_block.commit_burn,
next_staging_block.sortition_burn,
);
Ok(())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you meant to change replay_block to return a Result<,> and then return that result here?

Returning an unconditional Ok(()) regardless of what happens in replay_block (if it even ever returns -- it has some process::exits in there) doesn't seem like it's the intent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way it is now, replay_block only returns if it is successful. The error cases panic or process::exit, so it seems okay to do this here. You're probably right that it would be better to change that to return an error and pass that up.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these would all be considered success cases?

println!("No header info found for {block_id}");
return;

println!("No microblock stream found for {block_id}");
return;

let msg = format!(
"Invalid stacks block {}/{} -- does not attach to parent {}/{}",
block_consensus_hash,
block.block_hash(),
parent_block_header.block_hash(),
&parent_header_info.consensus_hash
);
println!("{msg}");
return;

(This is an honest question -- I'm not familiar enough with all this, especially the pre-naka stuff, to judge this. Totally possible that these are all considered "valid" results in this context.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These would represent errors in the chainstate, so technically not what is intended to be tested here, but it would probably be nice to actually make these errors more noticeable, rather than just printing to stdout and returning. It wouldn't be the worst thing to refactor this to return an error.

This option adds complexity to the block selection across pre-Nakamoto /
Nakamoto blocks and it doesn't seem useful, so remove it. Also, make
`range` exclusive of the `end` height.
Copy link

@benjamin-stacks benjamin-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left another comment about a question I'm not sure about, but in either case this change is an improvement on the status quo, so I'm happy to approve as is.

@jcnelson
Copy link
Member

I think @wileyj should try this patch out on a block-replay before merging, because the corresponding block validation script will need to be updated to use this new CLI.

That said, I'm kinda iffy on whether or not it was a good idea to put stacks-inspect into contrib if it's going to serve such a vital purpose in the release process. Is there a level of support we could have for stacks-inspect that's above what's in contrib, but without adding it to the bug bounty program? Maybe a question for @dhaney-stacks

@wileyj
Copy link
Collaborator

wileyj commented Dec 15, 2025

it's a block validation party today - will start once #6754 is complete.
regarding keeping this in contrib, as long as we can easily exclude stacks-inspect code from any big bounty program it could be moved back to the repo root or something.

@wileyj
Copy link
Collaborator

wileyj commented Dec 17, 2025

updated the validation script, and a few things are odd so far (not sure if it's bad or not, but it is behaving differently than stacks-inspect in 3.3.0.0.3 for example).
when validating pre-nakamoto blocks, the speed of earlier blocks finishes much faster, and it seems to be progressively slower for each process.
ex:
if i partition the total number of blocks and validation those blocks - the first blocks finish much faster.
100_000 blocks
10 partitions
10_000 blocks per partition
partition0 finishes quite fast (these would be the first blocks in the chain, probably smaller blocks without much activity - finishes in a few minutes)
partition9 will take much longer (larger blocks, more txs etc and takes a few hours)

for comparison, in 3.3.0.0.3 - each partition took roughly the same amount of time when using the (deprecated in this PR) function index-range, roughly 12_700s for each partition.

==> slice0.log <==
Command: /home/wileyj/stacks-inspect/target/release/stacks-inspect --config /home/wileyj/stacks-inspect/stackslib/conf/mainnet-follower-conf.toml validate-block  /home/wileyj/scratch/slice0 range 0 15469 2>/dev/null
Validating indexed blocks: 0-15469 (out of 185630)
Validating: 100% (16602/16602)

Finished validating 16602 blocks in 439s
0
==> slice1.log <==
Command: /home/wileyj/stacks-inspect/target/release/stacks-inspect --config /home/wileyj/stacks-inspect/stackslib/conf/mainnet-follower-conf.toml validate-block  /home/wileyj/scratch/slice1 range 15469 30938 2>/dev/null
Validating indexed blocks: 15469-30938 (out of 185630)
Validating:  53% (9004/16719)
==> slice2.log <==
... 
==> slice9.log <==
Command: /home/wileyj/stacks-inspect/target/release/stacks-inspect --config /home/wileyj/stacks-inspect/stackslib/conf/mainnet-follower-conf.toml validate-block  /home/wileyj/scratch/slice9 range 139221 154690 2>/dev/null
Validating indexed blocks: 139221-154690 (out of 185630)
Validating:   1% (300/17143)

i'm running a complete validation now, will share more details once it's complete and the logs are full.

wileyj added a commit to wileyj/stacks-core that referenced this pull request Dec 17, 2025
@wileyj
Copy link
Collaborator

wileyj commented Dec 17, 2025

attaching output from validating all blocks using range across 12 partitions.
logs.zip

as noted on the call, since hte blocks are processed serially - it makes more sense why the earlier blocks are validated much faster than the later blocks, and why some partitions take substantially longer than with version of stacks-inspect in develop branch (which doesn't process the blocks serially).

overall, a full run with from this branch took roughly 13 hours whereas from develop it was around 11 hours.

i will also introduce a block failure and re-run to see how the binary change logs validation failures

@wileyj
Copy link
Collaborator

wileyj commented Dec 17, 2025

@brice-stacks for readability - this is a little awkward. would be it possible to put the failure on a separate line?

note that this is a synthetic failure i introduced for epoch 2.5 to see what the output looks like

$ /home/wileyj/stacks-inspect/target/release/stacks-inspect --config /home/wileyj/stacks-inspect/stackslib/conf/mainnet-follower-conf.toml validate-block  /home/wileyj/scratch/slice0 range 107054 107056 2>/dev/null
Validating:  66% (2/3)Failed processing block! block = e60964eb18499310d30fe140725aa7f2d67d9aa6339ffd891cc06a36b6585ae0, error = InvalidStacksBlock("Block d442968758f7cf09927223bfcc86585adea0bd847f9438972ee2e8f33aa07d7f state root mismatch: expected ae1b7b0a25589259e13408e7e056e8d64e657da4305909bb34a3120396b49ddf, got b7795364840ba230f6bcd99d672760f17ce9b3a1208b7b09cb6eb32259e74b5a")

since the use case here is to not stop once a failure is discovered, it's helpful to have each one printed on their own line - that may be tricky with the change to a % based output though 🤔

@brice-stacks
Copy link
Contributor Author

I'm going to put the index-range option back in place and make sure that it works.

@brice-stacks brice-stacks marked this pull request as draft December 17, 2025 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants