-
Notifications
You must be signed in to change notification settings - Fork 23
Replace backbeat client with cloudserver client and migration aws sdk to v3 #2679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development/9.1
Are you sure you want to change the base?
Conversation
Hello sylvainsenechal,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
6398439 to
269de9b
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files
... and 10 files with indirect coverage changes
@@ Coverage Diff @@
## development/9.1 #2679 +/- ##
===================================================
- Coverage 74.81% 73.48% -1.33%
===================================================
Files 201 200 -1
Lines 13579 13560 -19
===================================================
- Hits 10159 9965 -194
- Misses 3410 3585 +175
Partials 10 10
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
d06be27 to
11d7667
Compare
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
|
5c4933f to
616c43b
Compare
| return done(err); | ||
| } | ||
|
|
||
| const backbeatClient = this.getBackbeatClient(accountId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note : There are a lot of variables, function names and possibly file names that would benefit from a rename, from backbeatClient to cloudserverClient etc.
I don't wanna do it now because it will add a lot of noise to that PR, and it's not even that straightforward to do with this silly programming language
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to make it more palatable, think of it as Backbeat(Routes)Client for now...
616c43b to
dcd1111
Compare
404c946 to
897ee3e
Compare
| .then(data => { | ||
| sourceEntry.setReplicationSiteDataStoreVersionId(this.site, | ||
| data.versionId); | ||
| // TODO : review metadata metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@francoisferrand I think using JSON.stringify will work, I see that the function .getSerialized() in Arsenal also returns a string from JSON.stringify that is then used in this same _publishMetadataWriteMetrics, but I'm not sure that "command.input" is equivalent to the old "destReq.httpRequest.body" data we used to call it with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few occurences where this _publishMetadataWriteMetrics is an issue for me.
Actually, we can add a middleware to extract the request body like this, and pass it to the metrics just as before.
Although, I have found that here it would likely be useless, because as you can see in the command above, there is Body in the command, so the length would always be 0.
I have only found one occurence where we the request has a body
command.middlewareStack.add(
next => async args => {
const request = args.request as any;
console.log("content-length:", request?.headers?.['content-length'])
console.log("len", Buffer.byteLength(request.body as any))
return next(args);
},
{ step: 'finalizeRequest' }
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thought : With the old SDK, we were using destReq.httpRequest.body, but if you look at the command, there is no proper body parameter, and with the new v3 client, I tried to get the body with the middleware and it is undefined...
Anyways, probably not worth it to overthink this problem, we just need to clearly state what is the metric we want to compute first (I believe that here it's the Tags, since the api is putTagging), so we probably just want to do something like
this._publishMetadataWriteMetrics(JSON.stringify(command.input.Tags), writeStartTime)
3f6b0a3 to
fb1beba
Compare
| accountId = res.Account; | ||
| } catch (err) { | ||
| // Workaround a Vault issue on 8.3 branch | ||
| // https://scality.atlassian.net/browse/VAULT-238 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this still relevant? Please create ticket to look at it, and possibly clean it eventually.,...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂 I was checking the same. We already have the ticket I guess...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the checklist
francoisferrand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(still reviewing, but aleady some findings)
| if (err.code === 'ObjNotFound' || err.code === 'NoSuchBucket') { | ||
| return done(err, { committable: true }); | ||
| if (err.code === 'ObjNotFound' || err.code === 'NoSuchBucket' || | ||
| err.name === 'ObjNotFound' || err.name === 'NoSuchBucket') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no actual object type, as the errors are not "parsed"?
(should be added in CloudserverClient eventually, I guess)
We should create a ticket, and already "flag" the places which would benefit for this in backbeat with something like TODO: use error types once supported in CLDSRVCLT-5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I didn't pay much attention to proper smithy error generation, I'm adding this to my checklist
| // because it means the object has been deleted by other means and we don't need to retry | ||
| if (err.code === 'ObjNotFound' || err.code === 'NoSuchBucket') { | ||
| return done(err, { committable: true }); | ||
| if (err.code === 'ObjNotFound' || err.code === 'NoSuchBucket' || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to check both code and name?
esp. since errors later check only name, does not seem consistent (either we need to check everywhere, or we can check nane only) ?
→ please review and align behavior where/if relevant
| logFields: { params }, | ||
| actionFunc: done => s3.getBucketLifecycleConfiguration(params, done), | ||
| actionFunc: done => { | ||
| const command = new GetBucketLifecycleConfigurationCommand(params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we not attach request id here?
(maybe cleanup for later, but best flag it already with a TODO and ticket)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked about requestID earlier.
I think what we can do is leave it as it is with this pr (I kept request id where it was used, but didn't add it where it wasn't used).
Then follow up ticket to check if some apis should use it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but ew can already add comments pointing to this followup in the code: since it is relatively easy to identify those spots in the diff (as it is relatively small vs the whole codebase). It does not change the current PR behavior at all, but will give a starting point to look at when working on that followup ticket.
extensions/lifecycle/management.js
Outdated
|
|
||
| (async () => { | ||
| try { | ||
| const command = new DeleteBucketLifecycleCommand(params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing request id
extensions/replication/management.js
Outdated
|
|
||
| (async () => { | ||
| try { | ||
| const command = new PutBucketVersioningCommand(params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing requestId
| accountId = res.Account; | ||
| } catch (err) { | ||
| // Workaround a Vault issue on 8.3 branch | ||
| // https://scality.atlassian.net/browse/VAULT-238 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂 I was checking the same. We already have the ticket I guess...
francoisferrand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments, but to sumup "major" issues:
- In quite a few places, code now calls the continuation callback from within the
tryblock. This is very dangerous (leading to duplicate callback code & hard to debug issues) and must be not be done : use callbackify() or Promise.then() instead - Some code can be dedup/simplified using callbackify() (e.g. common log or metrics in resolve & reject)
- “duplicated” CloudserverClient / S3Clent / IACClient setup, with lots of options & extra middleware : should be factorized with helper functions?
- Errors checked with both
err.codeanderr.name: seems weird, is this really needed? - RequestUID is not consistently passed ; not sure it can/should always be added, or how
getSerializedLogwill behave when not called from an API : but should be flagged to ease followup work - I still need to review the
AbortControllerand assocaited uses (CopyLocationTask/ReplicateObject). Now that we have fixed "underlying" issues with streaming, are these "deep" rework/refactor needed? Maybe worth checking, to minimize risk of breaking stuff by rewriting if not needed...
| } | ||
| return cb(null, roles[0], roles[1]); | ||
| }) | ||
| .catch(err => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: using utils.callbackify() instead of promise.then() allows to keep the same exact structure as before... or you can at least put the .catch block before .then
| return this.backbeatSource.getMetadata(params, (err, blob) => { | ||
| if (err) { | ||
| return this.backbeatSource.send(new GetMetadataCommand(params)) | ||
| .then(data => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, .catch().then() make it easier to review
| }; | ||
| return this.backbeatSource.getMetadata(params, (err, blob) => { | ||
| if (err) { | ||
| return this.backbeatSource.send(new GetMetadataCommand(params)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing request id
| return doneOnce(err); | ||
| }); | ||
| const incomingMsg = sourceReq.createReadStream(); | ||
| attachReqUids(command, log); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaces with a attribute in command param?
|
Should probably create a follow up ticket (edit : created : https://scality.atlassian.net/jira/software/c/projects/OS/boards/268?selectedIssue=BB-730)
Other things to check now for this ticket :
|
77722a8 to
561e37e
Compare
561e37e to
19e0184
Compare
francoisferrand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed the missing pices (CopyLocatoni & MultipleBackend), overall looks good, but:
- code is getting harder to follow as logic is partially (and artificially) reversed due to the mix of callback & promises with "shortcut" path: suggested approaches (early return, partial use of callbackify, or async.waterflow) which should allow keeping a natural flow
- in 2 cases, the abortController was introduced but it seems it never gets aborted
- need to keep metrics (even with TODO if we are not sure what size to pass), as they measure not only the size but also the number of calls and their duration
- lots of bebug logs to remove
| }; | ||
| return this._sendMultipleBackendPutObject( | ||
| actionEntry, objMD, size, incomingMsg, log, putDone); | ||
| return performPutObject(undefined); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a kind of pathologic case, where we can call _sendMultipleBackendPutObject directly
without much setup : would be much simpler to handle this case early, so we can see the "nominal"
case more easily, do not need to handle the extra case where incomingMsg is falsish, and avoid the
logic reordering...
| return performPutObject(undefined); | |
| if (size == 0) { | |
| return this._sendMultipleBackendPutObject(actionEntry, objMD, 0, "", log, doneOnce); | |
| } | |
| // continue with the nominal case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah early return is very good idea, dont know why i didnt do this earlier, it also helps to reduce nested code
| }, actionEntry.getLogInfo())); | ||
| return doneOnce(err); | ||
| } | ||
| log.error('an error occurred on getObject from S3', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
used not to be logged when aborted : should we not still skip the log in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I rechecked this with the more global changes of this function
| let incomingMsg = null; | ||
| let aborted = false; | ||
|
|
||
| const abortController = new AbortController(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how this abort controller is actually used/needed : since we never call abort()
(nor pass it to an AWS SDK api, so the SDK would call abort automatically)
It seems we just rely on incomingMsg.destroy() to trigger the abort if put fails...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the missing call to abort()
| .publish(); | ||
| return doneOnce(null, data); | ||
| }); | ||
| return performPutMPUPart(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same : best to handle the simpler "corner" case with an early return, so we can remove the anonymous function
and keep the "logical order of code.
| if (!sourceStreamAborted) { | ||
| sourceStreamAborted = true; | ||
| if (abortController) { | ||
| abortController.abort(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this actually needed?
error on incoming message would be handled already by the "get" command, which is the one listening
to the abortController.
In theory, calling it is best indeed (we never know if the caller of _putMPUPart may do), but I
wonder if this could "break" something to abort controller (and signaling get command to abort)
while already handling an error from the get command?
Did you confirm this case works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, calling abort() twice is fine.
I think this logic is fine : With the get command, the catch handles errors such as bucket not found. Then once its successful, we receive a stream which might fail later, and this is what we are handling here
73cc9b4 to
4b14274
Compare
4b14274 to
4a61a73
Compare
| if (!sourceStreamAborted) { | ||
| sourceStreamAborted = true; | ||
| abortController.abort(); | ||
| // eslint-disable-next-line no-param-reassign | ||
| err.origin = 'source'; | ||
| if (err.$metadata?.httpStatusCode === 404) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we use to handle errors the other way:
if (httpStatusCode === 404) {
log.error();
return done(ObjectNotFound)
}
if (!aborted) {
log.error();
}
return done(err)
should the 404 error not be logged (and converted to ObjNotFound) even if the request was aborted?
| abortController.abort(); | ||
| sourceStreamAborted = true; | ||
| if (incomingMsg.destroy) { | ||
| incomingMsg.destroy(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we used to call abort() instead of destroy() : was destroy implied by abort, yet now we have a readable, so only destroy is available?
|
|
||
| const TIMEOUT_MS = 1000 * 60 * 2; // 2 minutes in ms | ||
|
|
||
| function attachReqUids(s3req, log) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is specific to cloudserver, so (eventually) this should be moved to CloudserverClient I guess.
In addition, we may add lightweight wrappers on top of "official" AWS SDK "Command" types which support an extra request uid parameter in constructor and automatically add middleware when set: something like
class ListObjectCommand extend aws.ListObjectCommand {
constructor(params) {
const stdParams = { ...params };
delete stdParams.requestUID;
super(stdParams);
if (params.requestUID) {
attachReqUids(this);
}
}
}
This may be a bit tedious and repetitive (though it may be simplified by relying on the fact that a constructor is really just a global function in javascript, so we may have a function which acts on it and does the wrapping), then we may use typescript meta-programming to "generate" the wrappers...
| } = data; | ||
|
|
||
| const response = { | ||
| versionList: [...(Versions || []), ...(DeleteMarkers || [])], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the "value" you get for Versions and DeleteMarkers ?
if undefined, then the fallback to [] is not needed, it is perfectly fine to "spread" undefined...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at [1], it seems we can also safely spreak null ; and empty arrays works fine anyway:
All primitives can be spread in objects. Only strings have enumerable own properties, and spreading anything else doesn't create properties on the new object.
[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_syntax
francoisferrand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still a few comments to clarify, but looks good overall
Issue: BB-706
sdk v3 migration + backbeat client migration