Re: [vserver] Guest hourly/daily/... cron job parallel execution

From: Fiedler Roman <Roman.Fiedler_at_ait.ac.at>
Date: Thu 17 Jan 2013 - 10:05:25 GMT
Message-ID: <2ECE9D9EEF1F524185270138AE232659836A18@S0MSMAIL112.arc.local>

> -----Original Message-----
> From: Bendtsen, Jon [mailto:Jon.Bendtsen@laerdal.dk]
>
> On 17/01/2013, at 10.07, Fiedler Roman <Roman.Fiedler@ait.ac.at>
> wrote:
>
> >> -----Original Message-----
> >> From: Bendtsen, Jon [mailto:Jon.Bendtsen@laerdal.dk]
> >>
> >> On 17/01/2013, at 09.43, Fiedler Roman <Roman.Fiedler@ait.ac.at>
> >> wrote:
> >>
> >>> Hello List,
> >>>
> >>> What would be the most suitable solution to avoid high load/overlong
> >>> cron job execution for hourly/daily/... jobs? When using similar
> >>> vservers, all of them would start those jobs at the same time.
> >>
> >> You could manually reschedule them?
> >
> > Manual rescheduling is annoying; you need to estimate the run-times and
> > make some job-start plan to know the free slots.
>
> Or maybe just pick some random time, sort of like what old Ethernet did
> when there was a collision because 2 or more computers tried to transmit
> data at the same time.

Yes. Or use the context-ID modulo something.
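
For illustration, a minimal host-side sketch of that idea in bash; the location /etc/vservers/<guest>/context for the static context ID is an assumption about the local util-vserver setup, and the whole thing is untested:

#!/bin/bash
# Sketch only: derive a deterministic per-guest start time from the static
# context ID, assumed to be stored in /etc/vservers/<guest>/context.
for CFG in /etc/vservers/*/context; do
    GUEST=$(basename "$(dirname "$CFG")")
    XID=$(cat "$CFG")
    MIN=$((XID % 60))      # per-guest minute offset for hourly jobs
    HOUR=$((XID % 4))      # spread daily jobs over the first few hours
    printf '%s: hourly jobs at minute %02d, daily jobs at %02d:%02d\n' \
        "$GUEST" "$MIN" "$HOUR" "$MIN"
    # The random variant (Ethernet-backoff style) would instead be:
    # MIN=$((RANDOM % 60))
done

The printed times could then be written into each guest's /etc/crontab, either by hand or by some small helper.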

> >> Or use some kind of vserver <name> exec cat /etc/crontab to find out
> >> when they are run and change them to run at a different time?
> >
> > From my point of view, the 2 main advantages of this are:
> > * Easily implementable, easily understandable solution
> > * Lowest security impact/security relevance of the code
> >
> > The problem here is the run-time estimate: let's assume you have a job in
> > every guest that needs one CPU core for 2 min, and the host has 2 cores.
> > If 2 run in parallel, they complete in 2 min; if 3 run in parallel, they
> > would need 3 min (6 core-minutes on 2 cores), but due to more parallel
> > disk IO and decreasing cache efficiency the execution time could be more
> > like 3:20. If the stacking is too narrow, jobs deviating from the estimate
> > or additional jobs on guest or host are likely to make some jobs so slow
> > that their execution overlaps with the next run, decreasing the
> > performance even further and thus starting even more jobs in parallel.
>
> I see your point. If so, maybe we should split up the daily/weekly/hourly
> cron jobs even further, such that all the identical programs, which are hard
> linked together, can use the same cache and RAM for their instructions. But
> running locate to update its database in all guests at the same time might
> not be a good choice.

I would expect that caching of the application code itself is not the main performance boost; it is more about getting rid of the bottlenecks. As you noted, too many locate updates in parallel will kill disk performance. But losing the usual disk-cache benefit might also be problematic. If, e.g., just a few databases run a job, the relevant db content from disk is likely to end up completely in the OS RAM cache, so all disk reads from the db will return immediately. When too many DBs execute in parallel, the disk content of one process will be put into the cache, evicting pages that another job will soon need again.

> Maybe we need some kind of scheduler in the kernel that notices which
> processes in the different guests are hard linked and then prioritizes
> running those?
>
> Maybe we need some new kind of scheduler system that is made with
> virtualization in mind, such that it notices when there is low load and then
> tries to run the maintenance scripts, meaning that sometimes scripts are run
> with only 20 hours between them, and other times it might be 36 hours.

In my opinion, such an intelligent/learning scheduler would allow a significant increase in execution performance, but looking at the simplicity of current cron, I think that such a program is years away, if it is ever written at all.

> > On the other hand, if the distance between jobs is quite large, the last
> > guest's daily jobs will run quite long after the first guest's jobs have
> > completed. So it might get tight if all jobs should be finished between
> > 0:00 and the start of business hours.
>
> Why start at 0:00? Most people leave the office during the afternoon or
> early evening. I start my backup scripts at 18:00, but some people run a
> 24x7x365 business, so they cannot just pick an off-peak schedule for these
> jobs.

You are right, that's just something specific here. I want to have files that contain the data from the complete previous day, and I want to have them as soon as possible after midnight, so that the risk of a server death in between is lower.
 
> >> But maybe just using nice on all the commands in the guests will raise the
> >> responsiveness, but not lower the load.
> >
> > This is sufficient to improve the customer-side performance but will
> > decrease the execution efficiency of the cron jobs even further. Would
> > starting the cron daemon with nice make all jobs run with the same nice
> > value?
>
> I am not starting the cron daemon with nice. I put nice inside the crontab
> file, like these examples:
>
> 37 0 * * * root nice -n 15 /usr/local/sbin/AD_integration/find_disabled_users_from_AD_in_groups.sh
> 0,15,30,45 * * * * root nice -n 5 /usr/local/sbin/AD_integration/merge_AD_groups_with_unix.sh

Ah I see.

My current solution is:
* Check, for each guest, whether a cron scheduler is installed inside
* Check whether the guest cron would run hourly/daily by itself; if yes, let it do so. A misconfigured/malicious guest can always run any process at any time anyway; this has to be addressed via other means
* If guest cron is installed and the cron.daily/hourly ... directories exist, but the guest scheduler does NOT run those jobs from /etc/crontab by itself (cooperative guest), then start them via vserver exec
* Make sure to run only a given number of those guest processes in parallel

So all you need to do to opt in is to install cron in the guest but remove the run-scripts directives for hourly/daily from /etc/crontab. A rough host-side sketch of this scheme follows.
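
For reference, a bash sketch of such a host-side wrapper; the cron binary path, run-parts and the guest enumeration via /etc/vservers are assumptions about Debian-like guests on a util-vserver host, and the script is untested:

#!/bin/bash
# Sketch of the host-side wrapper described above.
MAX_PARALLEL=2
PERIOD=daily                  # hourly/daily/weekly/monthly
running=0

for DIR in /etc/vservers/*/; do
    GUEST=$(basename "$DIR")
    # Only handle guests that have cron installed inside.
    vserver "$GUEST" exec test -x /usr/sbin/cron || continue
    # Cooperative guests have removed the cron.$PERIOD lines from their
    # /etc/crontab; guests that still schedule those jobs are left alone.
    vserver "$GUEST" exec grep -q "cron\.$PERIOD" /etc/crontab && continue
    # Start the guest's periodic jobs in the background.
    vserver "$GUEST" exec run-parts "/etc/cron.$PERIOD" &
    running=$((running + 1))
    # Crude throttle: once the limit is reached, wait for the whole batch.
    if [ "$running" -ge "$MAX_PARALLEL" ]; then
        wait
        running=0
    fi
done
wait

A smarter variant would start the next guest as soon as one job finishes instead of waiting for the whole batch, but the batch-wait keeps the sketch simple.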
