I would like to suggest a worker-runner method for this situation. Create another job; let's call it the "runner", and call your current job the "worker". Set the schedule of the runner job to match your worker job's schedule, and then set the schedule of the worker to none [no schedule]. The runner job can be a simple program-type job with a command line like "24x7 /JOB worker"; it should be set to run synchronously and have a timeout of, say, 30 minutes. This way, if the worker job hangs, the runner will kill it 30 minutes after job start. You can of course configure an email notification action in the runner job so it emails you whenever it has to terminate the worker job. (For illustration, there is a rough sketch of the same watchdog idea at the end of this message.)

: Hi,
: I have been having a recurring problem. I have a 24x7 job that runs as a
: remote job. The master scheduler (A) launches this job to run on server B.
: I have a monitor that checks if server B is alive. Every now and then (no
: particular conditions identified so far) the job freezes while running on
: B, therefore holding the process in A (I run it as a detached synchronous
: job). The 24x7 on B doesn't respond anymore, and the job that monitors whether
: it is still alive hangs too.
: So, there are two problems we have:
: (1) The job that runs and hangs (blocking the 24x7)
: (2) The monitor job for B hangs in the queue (this monitor runs every 30 min,
: so it starts running about 10 min after the other job has been running)
: Any ideas?
: Is there a way to monitor the queues so we know when a job has been in the
: queue for more than X minutes?
: Thanks,
: -Mauricio
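
P.S. In case it helps to see the logic, here is a rough standalone sketch of the same runner/worker watchdog idea in Python. This is illustration only, not 24x7's actual configuration or API: the worker command line, timeout, and mail host/addresses are made-up placeholders you would replace with your own settings.

# Sketch of a "runner" that launches a worker command, waits synchronously
# with a timeout, kills the worker if it hangs, and sends a notification.
import subprocess
import smtplib
from email.message import EmailMessage

WORKER_CMD = ["24x7", "/JOB", "worker"]   # placeholder worker command line
TIMEOUT_SECONDS = 30 * 60                 # kill the worker after 30 minutes

def notify(subject, body):
    # Placeholder SMTP host and addresses -- substitute your own notification setup.
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "scheduler@example.com"
    msg["To"] = "admin@example.com"
    msg.set_content(body)
    with smtplib.SMTP("mail.example.com") as smtp:
        smtp.send_message(msg)

def run_worker():
    proc = subprocess.Popen(WORKER_CMD)
    try:
        # Run synchronously: wait for the worker, but no longer than the timeout.
        proc.wait(timeout=TIMEOUT_SECONDS)
    except subprocess.TimeoutExpired:
        # Worker appears hung: terminate it, then report that it was killed.
        proc.kill()
        proc.wait()
        notify("Worker job terminated",
               "The worker job exceeded %d minutes and was killed."
               % (TIMEOUT_SECONDS // 60))

if __name__ == "__main__":
    run_worker()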