 |
SoftTree Technologies
Technical Support Forums
|
|
Author |
Message |
Bill Richardson
Joined: 06 Dec 2002 Posts: 8
|
|
Job Hung In Queue as Running after Completion |
|
I'm using 3.4.11. Quite often, jobs will run and complete (I even get the e-mail notification saying the job is done); however, the queue that the job is running in still shows "Running", and the other jobs for the queue pile up behind this completed job.

My questions:

1. Were there any patches in 3.4.12 or 3.4.13 that improve queue behavior in this regard?
2. Once something is "stuck" in a queue, is there any way at all to remove the "stuck" job and let the queue continue processing? E.g., can I remove or alter files in the "program files\24x7 automation\queues" folder to free the queue? Maybe there is a registry entry somewhere that could be deleted to make the queue continue?

So far the only alternative I've found is to kill the whole scheduler, bring it up again, and manually start the jobs that were waiting in the queue behind the bogus "running" job.
|
|
Mon Sep 22, 2003 1:42 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7969
|
|
Re: Job Hung In Queue as Running after Completion |
|
You need to find and fix the cause of the job hanging in the queue. Check that jobs do not hang because they cannot communicate with your email server, and so on. Check the main job log for job warnings and errors; this may help you identify the cause.

Program-type jobs can be killed from the Windows Task Manager.

I also suggest avoiding an excessive number of asynchronous jobs. Unless a job takes a long time to run, it should not be run as an asynchronous job. If a job runs for less than 5 minutes, change its type to synchronous. If needed, create multiple job queues and assign different jobs to different queues. This way you can run multiple job streams simultaneously without them affecting each other.

As a last resort, try running 24x7 with the /DEBUG switch to generate additional debug records in the log file. After you reproduce the problem with the job hanging, send the schedule.log to support@softtreetech.com
|
|
Mon Sep 22, 2003 1:57 pm |
|
 |
Bill Richardson
Joined: 06 Dec 2002 Posts: 8
|
|
Re: Job Hung In Queue as Running after Completion |
|
As suggested in a different thread (Job Hangs After they are finished), I'm installing MS Office on an affected server to see if this helps.

As far as "fixing" the cause of the job hanging, I'm almost certain that what needs to be fixed is the Scheduler, or some obscure system DLL somewhere. The jobs that hang in queues are completely random. Sometimes it is a job that runs programs (.exe's), but I've also seen jobs hang after completion when the only thing the job does is execute a few JAL statements to move or copy some files. The only common element on the two servers where the problem exists is that they are multi-processor servers running Win 2000 Advanced Server with SP3. I did notice that SP4 came out recently, and am looking into installing it to see if it makes a difference.

To clarify one of the original questions: is there anything that can be done to "unstick" a queue once it freezes up? Or is killing the scheduler the only option?
|
|
Tue Sep 23, 2003 9:21 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7969
|
|
Re: Job Hung In Queue as Running after Completion |
|
You can only terminate jobs that run detached, in other words jobs that run as separate system processes and can be seen in the Task Manager.

I want to stress again that this problem only happens to asynchronous jobs. To avoid it, don't run asynchronous jobs.

I doubt SP4 will make any difference. If it does, please let us know.
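
To illustrate the point about detached jobs: a program that runs as its own system process can be ended from outside, which is what Task Manager does by hand. The sketch below is not 24x7 functionality and is not JAL. It is a hypothetical Python wrapper (the name `run_with_timeout` and the timeout values are assumptions made for illustration) showing one way a program could be launched so that it is killed automatically if it runs too long.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run `command` as a separate process and kill it if it overruns.

    Returns the process exit code, or None if the process had to be killed.
    """
    process = subprocess.Popen(command)
    try:
        # Block until the process exits or the timeout elapses.
        return process.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        process.kill()  # forcibly end the hung process
        process.wait()  # collect its exit status so nothing is left behind
        return None


if __name__ == "__main__":
    # A program that finishes promptly returns its normal exit code.
    quick = run_with_timeout([sys.executable, "-c", "pass"], 30)
    print("quick program exit code:", quick)

    # A program that overruns its one-second limit is killed.
    hung = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(60)"], 1
    )
    print("overrunning program was killed:", hung is None)
```

A wrapper like this only covers programs that run as separate processes. It would not help with jobs that run inside the scheduler itself, such as script jobs.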
|
|
Tue Sep 23, 2003 9:45 am |
|
 |
Bill Richardson
Joined: 06 Dec 2002 Posts: 8
|
|
Re: Job Hung In Queue as Running after Completion |
|
After more tests, I'm ready to post my findings on this problem.

1. There are two different kinds of "queue hangs". One is the type originally mentioned in this thread, i.e. the jobs that actually finish but stay in the queue. I haven't had much luck consistently reproducing this kind of hang. The other type is jobs that start, crash mysteriously internally with no error messages, and stay stuck in the queue. More details about how to reproduce this will follow in a later post.
2. Installing Office XP Pro on an affected server made no difference. The queues still hung randomly.
3. Synchronous jobs hang in the queue just as much as asynchronous ones do. In my tests, I used only synchronous jobs and still had the queues hang.
4. Turning on Tracing made the problem much less likely to occur. In my testing the past two days, I could not get the queues to hang when Tracing was turned on; however, I have seen queues hang with Tracing on in the past. Since turning on Tracing limits the problem, it's not possible to send you a trace of when the problem occurred.
|
|
Fri Sep 26, 2003 12:36 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7969
|
|
Re: Job Hung In Queue as Running after Completion |
|
Can you provide more information about your jobs?

Have you tried running these jobs detached? If yes, does it have any effect on the job hanging?
|
|
Fri Sep 26, 2003 3:12 pm |
|
 |
David Ciechanowicz
Joined: 05 Nov 2003 Posts: 6
|
|
Re: Job Hung In Queue as Running after Completion |
|
Hi,

I have the same problem. Here are details of the system and my findings so far:

System: HP DL360 G3 (dual Xeon 2.8 with hyperthreading on; 3 GB RAM)
OS: W2K Server SP4 + critical patches + 24x7 + NAV 8.1
24x7: 103 script jobs (JAL) using semaphores, notification actions, etc.

I've tried using multiple queues and a single queue, as well as synchronous, asynchronous, and detached jobs. I've removed some semaphores and instead created jobs that use JobRun and JobGetStatus statements. I've put an Exit statement at the end of every job and... nothing has changed. They still randomly get stuck.

The problem is exactly as Bill described it. By all visible results the job finishes, but in the queue monitor it's shown as running. Also, JobGetStatus returns -3 for that job - it's sick :-)

I've also found another 'job flow' bug: a job doesn't start after it has found its semaphore if the previous job got stuck (all jobs are asynchronous). The log entry shows that the job has found its trigger but is waiting in the queue.

I've also noticed that 24x7, especially some of its threads, has quite a reasonable number of page faults per second (10 when running jobs, 6 idle) - I don't know if that's connected.

Regards,
David

P.S. Turning trace on is not an option for me. I've got some Clipper programs to run simultaneously, and turning trace on causes 24x7 to open only one NTVDM.
|
|
Wed Nov 05, 2003 11:20 am |
|
 |