SoftTree Technologies
Technical Support Forums
Job Hung In Queue as Running after Completion

 
Bill Richardson



Joined: 06 Dec 2002
Posts: 8


I'm using 3.4.11. Quite often, jobs will run
and complete (I even get the e-mail notification
saying the job is done); however, the queue that the
job is running in still shows "Running" and the
other jobs for the queue pile up behind this
completed job.

My questions:

1. Were there any patches in 3.4.12 or 3.4.13 that
improve queue behavior in this regard?
2. Once something is "stuck" in a queue, is there
any way at all to remove the "stuck" job and let
the queue continue processing? E.g., can I remove/alter
files in the "program files\24x7 automation\queues" folder
to free the queue? Maybe a registry entry somewhere
that could be deleted that would make the queue
continue? So far the only alternative I've found
is to kill the whole scheduler, bring it up again,
and manually start the jobs that were waiting in the
queue behind the bogus "running" job.

Mon Sep 22, 2003 1:42 pm
SysOp
Site Admin


Joined: 26 Nov 2006
Posts: 7969


You need to find and fix the cause of the job hanging in the queue. Check that jobs are not hanging because they cannot communicate with your email server, and so on. Check the main job log for job warnings and errors. This may help you identify the cause.

Program type jobs can be killed from the Windows Task Manager.

I also suggest avoiding an excessive number of asynchronous jobs. Unless a job takes a long time to run, it should not be run as an asynchronous job. If a job runs for less than 5 minutes, change its type to synchronous.
If needed, create multiple job queues and assign different jobs to different queues. This way you can run multiple job streams simultaneously without them affecting each other.
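
To illustrate why separate queues isolate job streams, here is a toy Python model. It is only a sketch of the general idea and not how 24x7 implements its queues; the function name simulate_queues, the queue names, and the job names are all made up for the example. Each queue is processed serially, so a job that never leaves the "Running" state blocks only the jobs behind it in the same queue.

```python
def simulate_queues(queues, stuck_job):
    """Process every queue serially.

    A job equal to stuck_job never finishes, so all jobs behind it
    in the SAME queue keep waiting. Other queues are unaffected.
    Returns (completed_jobs, waiting_jobs).
    """
    completed = []
    waiting = []
    for jobs in queues.values():
        blocked = False
        for job in jobs:
            if blocked:
                waiting.append(job)      # piled up behind the stuck job
            elif job == stuck_job:
                blocked = True           # stays "Running" forever
            else:
                completed.append(job)
    return completed, waiting


# Hypothetical queue and job names, for illustration only.
queues = {
    "nightly": ["backup", "reindex", "report"],
    "transfers": ["ftp_get", "copy_files"],
}
completed, waiting = simulate_queues(queues, stuck_job="backup")
print(completed)  # ['ftp_get', 'copy_files']
print(waiting)    # ['reindex', 'report']
```

In this model the stuck "backup" job holds up only the "nightly" queue, while the "transfers" queue runs to completion.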

As a last resort, try running 24x7 with the /DEBUG switch to generate additional debug records in the log file. After you reproduce the problem with the job hanging, send the schedule.log to support@softtreetech.com
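
As a rough aid for the log check suggested above, the following Python sketch lists the lines of a log file that mention a warning or an error. The layout of schedule.log is not described in this thread, so the sketch assumes only that it is a plain-text file with one record per line and that problem records contain the word "warning" or "error". The function name find_problem_lines and the sample records are invented for the example.

```python
import os
import re
import tempfile

# Assumption: problem records contain the word "warning" or "error".
PROBLEM = re.compile(r"\b(warning|error)\b", re.IGNORECASE)


def find_problem_lines(log_path):
    """Return (line_number, text) pairs for lines that mention a warning or error."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if PROBLEM.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits


# Demo with made-up records; a real schedule.log will look different.
sample = (
    "job started\n"
    "WARNING: mail server not reachable\n"
    "job finished\n"
    "Error sending notification\n"
)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "schedule.log")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(sample)
    hits = find_problem_lines(path)

for number, text in hits:
    print(number, text)
```

Pointing find_problem_lines at the real log path would give a short list of records to read first, before going through the whole file.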

Mon Sep 22, 2003 1:57 pm
Bill Richardson



Joined: 06 Dec 2002
Posts: 8


As suggested in a different thread,
(Job Hangs After they are finished)
I'm installing MS Office on an
affected server to see if this helps.

As far as "fixing" the cause of the job hanging, I'm
almost certain that what needs to be fixed is the
Scheduler, or some obscure system DLL somewhere. The
jobs that hang in queues are completely random.
Sometimes it is a job that runs programs (.exe's), but
I've also seen jobs hang after completion when the
only thing the job does is execute a few JAL
statements to move or copy some files. The only common
element on the two servers where the problem
exists is that they are multi-processor servers
running Win 2000 Advanced Server with SP3. I did
notice that SP4 came out recently, and am looking
into installing it to see if it makes a difference.

To clarify on one of the original questions, is there
anything that can be done to "unstick" a queue once
it freezes up? Or is killing the scheduler the
only option?

Tue Sep 23, 2003 9:21 am
SysOp
Site Admin


Joined: 26 Nov 2006
Posts: 7969


You can only terminate jobs that run detached, in other words, jobs that run as separate system processes and can be seen in the Task Manager.
I want to stress again that this problem only happens to asynchronous jobs.
To avoid it, don't run asynchronous jobs.
I doubt SP4 will make any difference. If it does, please let us know.

Tue Sep 23, 2003 9:45 am
Bill Richardson



Joined: 06 Dec 2002
Posts: 8


After more tests, I'm ready to post my
findings on this problem.

1. There are two different kinds of "queue hangs". One is
the type originally mentioned in this thread--i.e.
the jobs that actually finish but stay in the queue.
I haven't had much luck consistently reproducing
this kind of hang.
The other type is jobs that start, crash mysteriously
internally with no error messages and stay stuck
in the queue. More details about how to
reproduce this will follow in a later post.
2. Installing Office XP Pro on an affected
server made no difference. The queues still
hung randomly.
3. Synchronous jobs hang in the queue just as
much as asynchronous ones do. In my tests, I used
only synchronous jobs and still had the queues
hang.
4. Turning on Tracing made the problem much less
likely to occur. In my testing the past two
days, I could not get the queues to hang when
tracing was turned on; however, I have seen
queues hang with Tracing on in the past. Since
turning on Tracing limits the problem, it's not
possible to send you a trace of when the problem
occurred.

Fri Sep 26, 2003 12:36 pm
SysOp
Site Admin


Joined: 26 Nov 2006
Posts: 7969


Can you provide more information about your jobs?
Have you tried running these jobs detached? If yes, does it have any effect on the job hanging?

Fri Sep 26, 2003 3:12 pm
David Ciechanowicz



Joined: 05 Nov 2003
Posts: 6


Hi,

I have the same problem. Here are the details of my system and my findings so far:

System:
HP DL360 G3 (dual Xeon 2.8 with hyperthreading on; 3 GB RAM)
OS: W2K Server SP4+ critical patches + 24x7 + NAV 8.1

24x7:
103 script jobs (JAL) using semaphores, notification actions, etc.

I've tried using multiple queues and a single queue, with synchronous, asynchronous, and detached jobs.
I've removed some semaphores and instead created jobs that use JobRun and JobGetStatus
statements. I've put an Exit statement at the end of every job and... nothing has changed.
They still randomly get stuck. The problem is exactly as Bill described it.
By all visible results the job finishes, but in the queue monitor it's shown as running.
Also, JobGetStatus returns -3 for that job - it's sick :-)
I've also found another 'job flow' bug:
A job doesn't start after it has found its semaphore if the previous job got stuck
(all jobs are asynchronous). The log entry shows that the job has found its trigger, but it's
waiting in the queue.
I've also noticed that 24x7, especially some of its threads, has a fairly significant number of
page faults per second (10 when running jobs, 6 when idle) - I don't know if that's connected.

Regards,
David

P.S.: Turning trace on is not an option for me. I've got some Clipper programs to run simultaneously,
and turning trace on causes 24x7 to open only one NTVDM.

Wed Nov 05, 2003 11:20 am
All times are GMT - 4 Hours