Skip to content

Conversation

@laurentsimon
Copy link
Contributor

@laurentsimon laurentsimon commented Jan 9, 2024

closes #666

Let me know if you'd like me to add a dedicated unit test for this new option.

Context: I'm working on adding a new hash type to rekor to support hashing directories. I will need to follow up to add support for it in https://github.com/trailofbits/sigstore-rekor-types.

Test example with pre-computed sha256:

from sigstore import hashes as sigstore_hashes
from sigstore_protobuf_specs.dev.sigstore.common.v1 import HashAlgorithm

with file.open(mode="rb", buffering=0) as io:
    digest = hashlib.sha256(io.read())
    hashed = sigstore_hashes.Hashed(digest=digest, algorithm=HashAlgorithm.SHA2_256)
    result = signer.sign(hashed)

Summary

This PR enables passing a pre-hashed value instead of the content to be hashed.

Release Note

Documentation

@laurentsimon laurentsimon changed the title feat: WIP Enable custom hash feat: Enable custom hash Jan 10, 2024
@laurentsimon laurentsimon changed the title feat: Enable custom hash feat: Enable signing with pre-computed hash Jan 10, 2024
@laurentsimon
Copy link
Contributor Author

tests seem to be failing because the bundle uses a different name for the hash algo (SHA2_256) instead of SHA256 for the standard crypto lib. I'll fix that while you review the PR.

@jku
Copy link
Member

jku commented Jan 10, 2024

If you run make reformat you'll start to get useful lint results as well

@laurentsimon
Copy link
Contributor Author

If you run make reformat you'll start to get useful lint results as well

I've run these commands and it works locally. However in CI it's failing. Any ideas?

@laurentsimon
Copy link
Contributor Author

I've made the changes, PTAL

@woodruffw woodruffw added component:signing Core signing functionality component:api Public APIs labels Jan 12, 2024
@woodruffw
Copy link
Member

I've run these commands and it works locally. However in CI it's failing. Any ideas?

Hmm, this might be an outdated local toolchain. What does ruff --version say?

sigstore/sign.py Outdated
Comment on lines 317 to 325
digest_algorithm: HashAlgorithm
"""
The digest algorithm to use for the hash.
"""

input_digest: HexStr
"""
The hex-encoded SHA256 digest of the input that was signed for.
The hex-encoded digest of the input that was signed for.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: Both of these fields will disappear with #804 and we intend to remove SigningResult entirely with #854, so we should probably avoid more fields here. Given that HashAlgorithm will currently always be invariant, maybe we can just hardcode it with a NOTE for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as soon as we update rekor (probably next week), it will no longer be an invariant, ie there wil be 2 supported hashes (sha256 and some dirhash hash)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha -- in that case, we should probably bump up the priority on #854, since SigningResult has outlived its utility in the world of Sigstore bundles 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. So you'd land #854 then this PR I suppose? Or you don't really care?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tangent, but is dirhash documented anywhere? I see there's a dirhash standard on GitHub but I wasn't sure if it was the same thing.

(Part of what I'm wondering is what the mapping between a dirhash and a standard algorithm is -- how do we form a Prehashed for them?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question. We have an implementation in https://github.com/google/model-transparency/blob/main/model_signing/serialize.py#L325 and we're working on documenting it. It will go thru slight changes as we go thru revisions until it lands in SLSA or we release v1 of model-transparency.

To form a prehash, we'll do something like:

class DIRSHA256(HashAlgorithm):
    name = "dirsha256"
    digest_size = 32
    block_size = -1

This should only be supported for hashes pre-computed by user for now, I think. Otherwise we also need to define additional fields for computation https://github.com/pyca/cryptography/blob/00f8304a3dfe7a2aab6f3150a3c620e87d848044/src/cryptography/hazmat/primitives/hashes.py#L60.

So I think there's a path to supporting it directly in sigstore-python in the future.

Does that match your understanding too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That matches my understanding, thanks! We should probably make sure that subclassing HashAlgorithm like that for external hashes is okay in terms of PyCA's APIs, but otherwise that seems reasonable to me.

@laurentsimon
Copy link
Contributor Author

I updated to ruff 0.1.13. I managed to locally fix the files. CI still complaining.

@woodruffw
Copy link
Member

I updated to ruff 0.1.13. I managed to locally fix the files. CI still complaining.

Gotcha, sorry for the mess here -- I'll do some debugging on the CI today. It looks like lint is still failing, although this time because we have public APIs without docstrings.

@laurentsimon
Copy link
Contributor Author

@woodruffw out of curiosity, what is the timeline for v3 release https://github.com/sigstore/sigstore-python/milestone/5?

@woodruffw
Copy link
Member

@woodruffw out of curiosity, what is the timeline for v3 release https://github.com/sigstore/sigstore-python/milestone/5?

I don't think we have a fixed timeline yet, it's mostly just a matter of when DSSE support and this get merged in, along with a few CLI deprecations.

(To speed things along, we could definitely do one or more v3 prereleases, if you're looking to consume the APIs downstream.)

@laurentsimon
Copy link
Contributor Author

@woodruffw out of curiosity, what is the timeline for v3 release https://github.com/sigstore/sigstore-python/milestone/5?

I don't think we have a fixed timeline yet, it's mostly just a matter of when DSSE support and this get merged in, along with a few CLI deprecations.

(To speed things along, we could definitely do one or more v3 prereleases, if you're looking to consume the APIs downstream.)

That'd be nice yes! I'd be interested in a prelrelease with this PR merged in, so we can start testing e2e on our side. What would be a timeline for it?

@woodruffw
Copy link
Member

That'd be nice yes! I'd be interested in a prelrelease with this PR merged in, so we can start testing e2e on our side. What would be a timeline for it?

We could make that pre-release as soon as this gets merged in 🙂

(We won't make any semver guarantees around pre-releases, but I think that's already given.)

@laurentsimon
Copy link
Contributor Author

That'd be nice yes! I'd be interested in a prelrelease with this PR merged in, so we can start testing e2e on our side. What would be a timeline for it?

We could make that pre-release as soon as this gets merged in 🙂

(We won't make any semver guarantees around pre-releases, but I think that's already given.)

I'd probably just ask that the rekor changes are in too. But I'll let you know once this is done.

@laurentsimon
Copy link
Contributor Author

I'll wait for #862 to land and update this PR. Anything else I should address in the PR?

Signed-off-by: laurentsimon <laurentsimon@google.com>
Signed-off-by: laurentsimon <laurentsimon@google.com>
@laurentsimon
Copy link
Contributor Author

laurentsimon commented Jan 17, 2024

I've rebased. However I was not able to test locally, keeps getting this error:

spec=rekor_types.hashedrekord.HashedrekordV001Schema( ^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: module 'rekor_types' has no attribute 'hashedrekord'. Did you mean: 'Hashedrekord'?

I did install python3 -m pip install sigstore-rekor-types==0.0.11.. Any advice? How do you typically test?

@jku
Copy link
Member

jku commented Jan 18, 2024

How do you typically test?

the makefile tries to manage a virtual env for testing:

  • this gets used by make test or can be manually generated by make dev
  • once the env exists exists you can enter it for manual testing with source env/bin/activate

FYI, the dependency version pinning is a bit strange:

  • pyproject.toml documents the actual required runtime versions
  • install/requirements.txt documents pinned versions -- but this is only updated at release time (I think this is a bit confusing and I'm not sure what the purpose is)

For sigstore-rekor-types: you want 0.0.12 as documented by pyproject.toml. The virtual environment should already have this version.

Signed-off-by: laurentsimon <laurentsimon@google.com>
Signed-off-by: laurentsimon <laurentsimon@google.com>
Signed-off-by: laurentsimon <laurentsimon@google.com>
@laurentsimon
Copy link
Contributor Author

How do you typically test?

the makefile tries to manage a virtual env for testing:

  • this gets used by make test or can be manually generated by make dev
  • once the env exists exists you can enter it for manual testing with source env/bin/activate

FYI, the dependency version pinning is a bit strange:

  • pyproject.toml documents the actual required runtime versions
  • install/requirements.txt documents pinned versions -- but this is only updated at release time (I think this is a bit confusing and I'm not sure what the purpose is)

For sigstore-rekor-types: you want 0.0.12 as documented by pyproject.toml. The virtual environment should already have this version.

Thank you, that was super helpful. I was able to make the changes and the conformance tests are now passing. PTAL.

Copy link
Member

@woodruffw woodruffw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me @laurentsimon! One last question, and then I think we're good to merge here.

Signed-off-by: laurentsimon <laurentsimon@google.com>
Signed-off-by: laurentsimon <laurentsimon@google.com>
@woodruffw
Copy link
Member

/gcbrun

@woodruffw
Copy link
Member

LGTM, although lint is still failing. @laurentsimon are you able to diagnose that? Otherwise I can try and do it, if you give me push access to your branch.

@laurentsimon
Copy link
Contributor Author

LGTM, although lint is still failing. @laurentsimon are you able to diagnose that? Otherwise I can try and do it, if you give me push access to your branch.

linter passes locally, so I think you're in a better position to debug that part :)
You have write access to the PR branch, I think

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member

/gcbrun

@woodruffw
Copy link
Member

Fixed up the CI (sorry for the pain here, I need to figure out what's going on).

This LGTM, but @jku or another maintainer will need to approve now that I'm the last pusher. Given that, I'll spend some time writing some tests here.

Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member

/gcbrun

Comment on lines +127 to +148
@pytest.mark.online
@pytest.mark.ambient_oidc
def test_sign_prehashed(staging):
sign_ctx, verifier, identity = staging

sign_ctx: SigningContext = sign_ctx()
verifier: Verifier = verifier()

input_ = io.BytesIO(secrets.token_bytes(32))
hashed = Hashed(digest=sha256_streaming(input_), algorithm=HashAlgorithm.SHA2_256)

with sign_ctx.signer(identity) as signer:
bundle = signer.sign(hashed)

assert bundle.message_signature.message_digest.algorithm == hashed.algorithm
assert bundle.message_signature.message_digest.digest == hashed.digest

input_.seek(0)
materials = VerificationMaterials.from_bundle(
input_=input_, bundle=bundle, offline=False
)
verifier.verify(materials=materials, policy=UnsafeNoOp())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this to confirm that the round-trip works as expected: we sign over a Hashed(...) and then verify over the pre-image.

@woodruffw woodruffw requested a review from jku January 18, 2024 19:50
Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@jku jku merged commit b3fdb31 into sigstore:main Jan 19, 2024
@woodruffw
Copy link
Member

Thanks a ton for your hard work here @laurentsimon!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:api Public APIs component:signing Core signing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Provide API that take in a hash instead of io bytes

3 participants