Skip to content

Conversation

@jamielamont
Copy link

Created new orchestrator process
Setup configurable lower and upper limits
Implemented scale up and scale down functions
Function to auto scale up upon startup to meet lower limit set in config
Ability to call orchestrator to scale up and down from gateway

addproc.sh Outdated
@@ -0,0 +1,66 @@
#!/bin/bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could addproc and remove proc be consolidated into a single bash script? Perhaps using flags for up and down? Would be nicer to have a scaleproc.sh script that handles both scenarios

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this would be a lot tidier. I'll get working on this.

// Join active .gw.servers to .servers.SERVERS table
activeservers:{lj[select from .gw.servers where active;`handle xcol `w xkey .servers.SERVERS]}

/function to tell orchestrator to scale up
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here is duplicated across both. Could you have a generic function to cover both scenarios and either use params or projections?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were still working on adjusting the gateway code with differences between scaling up and down and this made it simpler. It can be a generic function with these changes added, I'll give it a look.

scalingdetails:([] time:`timestamp$(); procname:`$(); dir:`$(); instancecreated:`$(); instanceremoved:`$(); totalnumofinstances:`int$(); lowerlimit:`int$(); upperlimit:`int$()); /table for tracking scaling

processlimitscsv:hsym first .proc.getconfigfile"processlimits.csv"; /location of csv file
limits:1!("SII";enlist ",")0:processlimitscsv; /table of scalable processes and the max number of instances allowed for each
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the key of the file? 1! doesn't give an indication, I think I'd prefer use of xkey here given we don't know the file.

2 other points - is there already csv loading functionality in TorQ? Could it be leveraged? For example what happens if file doesn't exist here? OR if the file load fails? We need to handle those scenarios.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok cool I will have a look for this functionality

/initialises connection to discovery process and creates keyed table containing the number of instances of each scalable process
getscaleprocsinstances:{[]
.servers.startup[];
`.orch.procs set string@/:exec procname from .servers.procstab;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these stringed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the next line (Line 17) there is a lambda which gets the count of instances for each scalable process. This is done by checking which of the strings in the .orch.procs list are like the scalable process. E.g. {sum procs like "rdb1*"} will recognize all instances of rdb1 i.e. rdb1.2 rdb1.3 and so on

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if there's lots of rdbs and we have an rdb10, would that be detected as a replicated instance of rdb1?

I also don't think you need the @/: here?

Copy link
Author

@jamielamont jamielamont Dec 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes currently it would be detected so will need fixed and yes it's not needed, I see now 'like' accepts string or symbol in left operand. Thanks

getscaleprocsinstances[];

/function to scale up a process
scaleup:{[procname]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again scaleup and scaledown here there looks to be a lot of duplication in the code. Potentially parameterize or project.

];
}

initialscaling@/:scaleprocslist;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this doing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calls the intialscaling function for each of the scalable processes to ensure they have been scaled up to their lower limits on startup

/initialises connection to discovery process and creates keyed table containing the number of instances of each scalable process
getscaleprocsinstances:{[]
.servers.startup[];
`.orch.procs set string@/:exec procname from .servers.procstab;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if there's lots of rdbs and we have an rdb10, would that be detected as a replicated instance of rdb1?

I also don't think you need the @/: here?

Comment on lines 24 to 31
$[scaleprocsinstances[procname;`instances]<limits[procname;`upper];
[system "bash ${TORQHOME}/addproc.sh ",string procname;
/update number of process instances
getscaleprocsinstances[];
/update table with record for scaling up
`.orch.scalingdetails upsert (.z.p;procname;`up;`$(last procs where procs like string[procname],"*");`;scaleprocsinstances[procname;`instances];limits[procname;`lower];limits[procname;`upper])];
.lg.o[`scale;"upper limit hit for ",string procname]
];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-line $ with blocks in them are confusing IMO. I'd reverse the condition & early exit in an if e.g.

Suggested change
$[scaleprocsinstances[procname;`instances]<limits[procname;`upper];
[system "bash ${TORQHOME}/addproc.sh ",string procname;
/update number of process instances
getscaleprocsinstances[];
/update table with record for scaling up
`.orch.scalingdetails upsert (.z.p;procname;`up;`$(last procs where procs like string[procname],"*");`;scaleprocsinstances[procname;`instances];limits[procname;`lower];limits[procname;`upper])];
.lg.o[`scale;"upper limit hit for ",string procname]
];
if[scaleprocsinstances[procname;`instances]>=limits[procname;`upper];
.lg.o[`scale;"upper limit hit for ",string procname];
:();
];
system "bash ${TORQHOME}/addproc.sh ",string procname;
/update number of process instances
getscaleprocsinstances[];
/update table with record for scaling up
`.orch.scalingdetails upsert (.z.p;procname;`up;`$(last procs where procs like string[procname],"*");`;scaleprocsinstances[procname;`instances];limits[procname;`lower];limits[procname;`upper]);


/function to tell orchestrator to scale up
scaleup:{[procname]
handle:first exec w from .servers.SERVERS where proctype=`orchestrator;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could/should probably use .servers.gethandlebytype here

Suggested change
handle:first exec w from .servers.SERVERS where proctype=`orchestrator;
handle:.servers.gethandlebytype[`orchestrator;`any];

(this will mean that if for whatever reason connection to orchestrator has been lost, it will attempt to re-open it first)

Comment on lines 48 to 54
/function to ensure all processes have been scaled up to meet lower limit
initialscaling:{[procname]
if[scaleprocsinstances[procname;`instances]<limits[procname;`lower];
reqinstances:limits[procname;`lower]-scaleprocsinstances[procname;`instances];
do[reqinstances;scaleup[procname]];
];
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's some indenting with spaces here, rest of file appears to be using tabs - make it consistent please

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants