[CELEBORN-2259] The S3MultipartUploadHandler uses fs.s3a.aws.credentials.provider#3599
[CELEBORN-2259] The S3MultipartUploadHandler uses fs.s3a.aws.credentials.provider#3599Dzeri96 wants to merge 1 commit intoapache:mainfrom
Conversation
|
@xy2953396112 @FMX @SteNicholas can you please take a look ? |
|
|
||
| static AWSCredentialProviderList getCredentialsProvider(URI binding, Configuration conf) | ||
| throws IOException { | ||
| return S3AUtils.createAWSCredentialProviderSet(binding, conf); |
There was a problem hiding this comment.
the API is deleted in HADOOP-18073. S3A: Upgrade AWS SDK to V2
|
change makes sense, the only concern is that the API is removed in Hadoop 3.4.0 |
|
Upgrading AWS SDK to V2 is a much bigger change, I think it is better to do it as a separate step, and I guess it will have some other impact. The current solution does not require to copy/paste code from Hadoop codebase, as as the dependency is in the pom, when we upgrade the SDK we will also upgrade this code what do you think @pan3793 ? |
I'm fine with that, but please at least leave some comments there. BTW, Hadoop 3.4.3 (under RC phase) will be the first version that works with JDK 25, it's likely that Celebron will upgrade it soon. |
| ExclusionRule("com.amazonaws", "aws-java-sdk-bundle")) | ||
| val awsS3 = "com.amazonaws" % "aws-java-sdk-s3" % awsS3Version | ||
| // Needed for com.amazonaws.auth.WebIdentityTokenCredentialsProvider | ||
| val awsSTS = "com.amazonaws" % "aws-java-sdk-sts" % awsS3Version |
There was a problem hiding this comment.
add a new dep to the binary release tarball? don't forget to update LICENSE-binary/NOTICE-binary
| AWSCredentialProviderList providers = | ||
| S3MultipartUploadHandler.getCredentialsProvider(null, conf); | ||
| Assert.assertTrue(providers.size() != 0); | ||
| System.out.println(providers.listProviderNames()); |
There was a problem hiding this comment.
BTW, I guess the default value matches the existing behavior, right?
What changes were proposed in this pull request?
The S3 Client in
S3MultipartUploadHandlernow uses the dynamic configfs.s3a.aws.credentials.providerin order to set its provider chain up.Why are the changes needed?
Before this, it was only possible to use the hard-coded provider configuration.
Does this PR resolve a correctness bug?
Sort of.
Does this PR introduce any user-facing change?
Yes, in the sense that
celeborn.hadoop.fs.s3a.aws.credentials.providerwill now work correctly in the MultiPartHandler.How was this patch tested?
Unit tests and a manual test.
Note: I don't like having to change the class in order to make it testable, but I'm planning to get rid of this whole logic in another PR, where we will use the same hadoop-created S3 client everywhere.