|
SoftTree Technologies
Technical Support Forums
|
|
davidyeung
Joined: 13 Feb 2023 Posts: 5 Country: Hong Kong |
|
24x7 Scheduler semaphore file and long running job |
|
I am using 24x7 Scheduler Multi-platform Edition version 6.2 Build 657, installed on Windows 10, and I have some questions.
1. I would like to set up a job that is triggered by a semaphore file. For testing, I set up a job with the JavaScript code below to try to get the name of the file that actually triggered the job.
Scheduler.logAddMessage("INFO", @V"job_id", "@V"job_name"", "@V"files"");
The logged message is "[ERROR]". How can I get the correct file name that triggers the job?
2. Is it possible to set up a job that monitors for any long-running job (for example, one running more than 1 hour) and sends an email if any are found?
|
|
Wed Mar 01, 2023 4:40 am |
|
|
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7904
|
|
|
|
1. How do you test that job? If you start it manually instead of waiting for a semaphore file, the logged value will be "[ERROR]" because the job has no file that triggered it.
2. Monitoring "any" job is not possible, but you can have two jobs scheduled to start at the same time, a worker and a watcher, and have the watcher literally sleep for 1 hour and then check whether the worker job delivered the expected results. They need to be placed in different queues or have their mode set to asynchronous. Still, that seems like overkill. Why not simply set the worker job's timeout to 1 hour and enable email notification for errors in its settings? That would cover timeouts too, and requires only minimal setup.
Also, if you want to find out which jobs took longer than an hour to run, you can use a report for that purpose and filter the job execution log on job runs with end time minus start time greater than 1 hour.
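To illustrate the "end time minus start time greater than 1 hour" filter, here is a minimal sketch in plain JavaScript. It does not use any scheduler call. The record shape (jobName, startTime, endTime) and the sample rows are hypothetical, not the actual 24x7 log schema, and in a real report the filter would normally live in the report's own query.

```javascript
// Hypothetical job-run records; the real 24x7 execution log has its own schema.
var runs = [
  { jobName: "nightly_export", startTime: "2023-03-01T01:00:00Z", endTime: "2023-03-01T02:30:00Z" },
  { jobName: "cleanup",        startTime: "2023-03-01T03:00:00Z", endTime: "2023-03-01T03:05:00Z" }
];

var ONE_HOUR_MS = 60 * 60 * 1000;

// Return only the runs whose (end time - start time) exceeds the limit.
function findLongRuns(records, limitMs) {
  return records.filter(function (r) {
    var durationMs = new Date(r.endTime).getTime() - new Date(r.startTime).getTime();
    return durationMs > limitMs;
  });
}

var longRuns = findLongRuns(runs, ONE_HOUR_MS);
```

With the sample rows above, only the 90-minute run is kept. The result list is what you would put into the body of the alert email.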
|
|
Wed Mar 01, 2023 9:42 am |
|
|
davidyeung
Joined: 13 Feb 2023 Posts: 5 Country: Hong Kong |
|
|
|
Hi SysOp, thanks for your reply.
For 1, I am not triggering the job manually. I wait for the semaphore file, and the job is triggered by it. However, the message logged is still "[ERROR]". Is there any other reason that could cause this?
The scheduler.log and debug.log contents do not show anything interesting.
For 2, I tried a job with the timeout set to 1 minute, and I put a wait function in the job to wait for 2 minutes. The job does terminate after 1 minute.
I also set up the "Notification actions and dependencies" section to create a semaphore file for the "Job Error" event. However, when the job terminates after 1 minute, the file is not created.
It does create the file if I set it to the "Job Finish" event instead. Is there anything I can try so that a file is created when the job times out?
|
|
Thu Mar 02, 2023 6:38 am |
|
|
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7904
|
|
|
|
Using the same numbering:
1. Is the job set to run detached or not? Please try one more time with the Detached option unchecked.
2. When a job is terminated on timeout, it's not considered an error. There can be valid use cases where stopping a process after some time is by design. In the log you will see it as a warning, not an error. That's why the On Error notification doesn't work as you expected. If you want to do something specific on timeout, you can, for example, use a script type job and a notification action on job finish. In the job properties, in the dynamic variables step, specify something like
|
|
{
"job_XYZ_state": "1"
}
On job startup the scheduler will create the job_XYZ_state variable and initialize it to 1.
At the end of the job script, add
|
|
Scheduler.setJobVariable( @V"job_id", "job_XYZ_state", "2" );
In the On Finish notification action, check the state of the variable. If it's "2", the job completed normally. If it's "1", it was terminated on timeout. There you can choose what to do next: send an email, create a file, or run another job.
var state = Scheduler.getJobProperty( @V"job_id", "job_XYZ_state" );
|
|
// State is still "1": the script never reached its last line, so the job was terminated on timeout
if (state == '1')
{
    Scheduler.mailSend(...);
    Scheduler.runJob(...);
    File.save("my semaphore file", "dummy value");
}
My gut tells me that #2 is a bad idea. I don't mean using variables and checking their state; I mean the overall design, which would be non-transparent and hard to maintain. The solution should not be based on monitoring jobs killed by the scheduler. Jobs should be designed to abort gracefully after a certain time and raise an exception or emit some notification from inside the job. The actual method is very likely business-case specific and will differ from case to case. Here is a simple example
|
|
var runInfo = Process.runAndWait("/bin/sleep 60", 30);
var output = runInfo.getOutput();
// ... evaluate the output here and decide what to do next based on its content ...
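Another way to picture the "abort gracefully from inside the job" idea, for work done in the script itself rather than in an external process, is a time budget that is checked between steps. The sketch below is plain JavaScript with no scheduler calls. The function name runWithDeadline, the steps array, and the injectable now clock are all hypothetical names introduced for illustration.

```javascript
// Run a list of work steps in order, raising an exception once the time budget is spent.
// `steps` is an array of functions; `now` is an optional clock, injectable for testing.
function runWithDeadline(steps, budgetMs, now) {
  now = now || function () { return Date.now(); };
  var startedAt = now();
  for (var i = 0; i < steps.length; i++) {
    // Check the budget before starting each step, so a step is never interrupted halfway.
    if (now() - startedAt > budgetMs) {
      throw new Error("Time budget of " + budgetMs + " ms exceeded before step " + (i + 1));
    }
    steps[i]();
  }
  return steps.length; // number of steps completed
}
```

The job script would call this inside try/catch and send its own notification from the catch block, so the alert comes from the job itself instead of from something watching for jobs killed by the scheduler. Note that the check only runs between steps, so a single step that hangs is not interrupted.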
|
|
Thu Mar 02, 2023 10:10 am |
|
|
davidyeung
Joined: 13 Feb 2023 Posts: 5 Country: Hong Kong |
|
|
|
1. I tried that as well and it still does not work. Could you please share the steps to set up a simple job, if possible, so that I can compare and see which setting might be causing the issue?
2. I understand what you mean. My goal is to be notified by email if a job runs too long and blocks the execution of other jobs, and I am looking for a simple way to achieve this. Apart from the examples you provided, are there any other recommended approaches? By the way, there is a report called "Last 30 days: Long running jobs". Is it possible to use this report and send an email if long-running jobs are found?
|
|
Thu Mar 02, 2023 11:56 pm |
|
|
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7904
|
|
|
|
1. I've spotted an error in your example: the macro variable is @V"file", but you have @V"files". I think correcting that would fix the "[ERROR]" result. The simplest way to test it is to create a command line job. If you are on Windows, the command line could be like
cmd /C echo "Triggered by @V"file""
or similar for *nix systems
/bin/sh -c 'echo "Triggered by @V"file""'
Let the job run, and check in its Output capture logs (the Output tab) what that command printed.
2. You can schedule that report to run periodically and email the results to yourself. Right-click it and select Properties from the menu, then in the Properties dialog use the Schedule Job... button to schedule it. If I understand correctly what you are after, consider creating a custom report similar to "Last 30 days: Long running jobs", but for one day only, and schedule it to run every day. I still think that if you want to send an email when a process is terminated on timeout, using runAndWait or something similar might be a good option. In the alert email you can also add custom context or instructions in case you send it to other people, such as operational staff, and you want to let them know what they need to do next.
|
|
Fri Mar 03, 2023 9:44 am |
|
|
davidyeung
Joined: 13 Feb 2023 Posts: 5 Country: Hong Kong |
|
|
|
Hi SysOp,
I tried it and it works now after changing to @V"file".
I used @V"files" because that is what appears on page 19 of the JAL reference PDF.
Also, this macro variable seems to work only in JAL script and not in JavaScript. I am not sure why it is an empty string in JavaScript.
Thanks again for your support!
|
|
Wed Mar 08, 2023 4:12 am |
|
|
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7904
|
|
|
|
I believe it should work everywhere, not only in JAL scripts. Macro variables are not script specific; they can also be used in command lines and other places. When you get a chance, try a simple Program type job with a command line like
cmd /C echo "Triggered by @V"file""
|
|
Wed Mar 08, 2023 9:05 am |
|
|
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7904
|
|
|
|
Here is another idea for monitoring job timeout issues. In the Options Logging Settings, enable the log database extender so that logs are sent to a database of your choice. Create a scheduled job to periodically (every 5 minutes or so) scan recently added log records whose message text contains the word Timeout and, if any are found, email them to yourself.
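The scanning step could look roughly like the sketch below, written in plain JavaScript with no database or scheduler calls. Everything specific in it is an assumption made for illustration: the column names (id, message), the sample message texts, and the idea of remembering the last processed id between runs. The real log table layout and message wording depend on your database extender setup.

```javascript
// Made-up sample rows standing in for records fetched from the log database.
// Column names and message texts are assumptions, not the real 24x7 log format.
var rows = [
  { id: 101, message: "Sample message: job A started" },
  { id: 102, message: "Sample message: job A stopped, Timeout reached" },
  { id: 103, message: "Sample message: job B finished" }
];

// Keep rows newer than lastSeenId whose message mentions "timeout" (case-insensitive).
function findTimeoutRows(records, lastSeenId) {
  return records.filter(function (r) {
    return r.id > lastSeenId && /timeout/i.test(r.message);
  });
}

// Highest id in this batch, to be stored and passed as lastSeenId on the next run.
function maxId(records, fallback) {
  return records.reduce(function (m, r) { return r.id > m ? r.id : m; }, fallback);
}
```

Tracking the last processed id keeps each 5-minute run from emailing the same timeout record again.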
|
|
Wed Mar 08, 2023 10:31 am |
|
|
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7904
|
|
|
|
And just to complete the picture, if you use Splunk, Humio, DataDog, or another enterprise-grade log aggregation solution, you can set it up to consume 24x7's schedule.log file and create a rule there to alert on job timeout issues.
|
|
Wed Mar 08, 2023 10:42 am |
|
|
|
|
|
|
|