seanc217
Joined: 23 May 2007 Posts: 272
|
|
Job stuck in queue |
|
What causes a job to get stuck in the queue?
I looked on the agent that the job is supposed to run on, but I see no running processes.
The queue monitor status says "awaiting".
Thanks,
Sean
|
|
Thu Jan 24, 2008 12:47 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7949
|
|
|
|
"Awaiting" status means that the job is ready to be processed: it has already been inserted into the queue and is currently waiting for the queue to become available. The issue is somewhere else. Please take a look at the files in the named queue directory on the disk. Are there many files or just one for the pending job?
|
|
Thu Jan 24, 2008 1:17 pm |
|
 |
seanc217
|
|
|
|
There is one file under this directory:
-rw-r--r-- 1 srv_etl dsadm 1407 Jan 23 16:01 1513.q
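A check like the listing above can be scripted. The sketch below lists `*.q` files in a queue directory that are older than a chosen threshold. It assumes, based only on the `1513.q` entry above, that each pending job is stored as one `.q` file; the function name `stale_queue_files` and the one-hour default are illustrative and not part of the scheduler.

```python
import time
from pathlib import Path


def stale_queue_files(queue_dir, max_age_hours=1.0, now=None):
    """Return (file name, age in hours) for *.q files older than max_age_hours.

    Assumption: pending jobs appear as <id>.q files, as in the listing above.
    """
    now = time.time() if now is None else now
    stale = []
    for path in sorted(Path(queue_dir).glob("*.q")):
        # Age is measured from the file's last modification time.
        age_hours = (now - path.stat().st_mtime) / 3600.0
        if age_hours > max_age_hours:
            stale.append((path.name, round(age_hours, 1)))
    return stale
```

Pass the named queue directory as `queue_dir`. A file that keeps showing up across several checks is more likely a stuck entry than a job that is only briefly pending.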
|
|
Thu Jan 24, 2008 1:21 pm |
|
 |
SysOp
|
|
|
|
So the file has been sitting there since yesterday. Please check the debug.log file for exceptions related to this job or queue. If the tracing option is currently enabled, there are likely to be some. If it is not enabled, exceptions may still have been recorded, but the chances are not very high.
|
|
Thu Jan 24, 2008 1:30 pm |
|
 |
seanc217
|
|
|
|
OK I think I know what happened.
I have one job that calls another job upon completion. That job failed and was disabled. Would that cause the queue to get stuck the next time the job runs?
Thanks.
|
|
Thu Jan 24, 2008 3:11 pm |
|
 |
seanc217
|
|
|
|
It seems that it was one job getting stuck all the time.
I deleted this job and re-created it and it seems to be OK now.
Not sure what happened though.
Thanks.
|
|
Thu Jan 24, 2008 3:30 pm |
|
 |
SysOp
|
|
|
|
Yes, that's possible in theory, but without a trace or a log of past errors it isn't possible to say for sure whether the failing dependent job is the cause. It could be a chain reaction, or the two problems could be completely unrelated.
|
|
Thu Jan 24, 2008 3:41 pm |
|
 |
seanc217
|
|
|
|
Hi there, I'm still having issues with jobs getting stuck in the queue.
If I look in the queue monitor, the status says "awaiting".
I turned on tracing for the master and the agent.
On the master the following entries were in the debug.log file:
2008-03-14 10:31:39,534 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - main(...) : start
2008-03-14 10:31:39,535 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : start
2008-03-14 10:31:39,810 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : creating business objects
2008-03-14 10:31:40,188 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : creating UI controller
2008-03-14 10:31:40,365 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : initializing business objects
2008-03-14 10:31:40,370 [main] DEBUG com.softtreetech.jscheduler.business.preferences.AbstractPrefDatabase - Creating backup for preferences file preferences.xml
2008-03-14 10:31:40,373 [main] DEBUG com.softtreetech.jscheduler.business.preferences.AbstractPrefDatabase - Preferences file has been copied to preferences.bak
2008-03-14 10:53:10,959 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 10:55:16,528 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 10:55:21,400 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
|
|
Fri Mar 14, 2008 11:02 am |
|
 |
seanc217
|
|
|
|
Ok this is odd.
I clicked on hold job in the queue manager.
I then clicked on release.
Once I did this, the job ran and was removed from the queue.
Any ideas?
|
|
Fri Mar 14, 2008 11:05 am |
|
 |
SysOp
|
|
|
|
This is odd indeed. Can you post the contents of debug.log after the hold and release?
|
|
Fri Mar 14, 2008 11:21 am |
|
 |
seanc217
|
|
|
|
OK, some errors did occur.
Here are the last 100 entries from the log:
2008-03-14 11:11:01,029 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 11:14:05,987 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 11:20:38,177 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.ui.ooOO.G.o0OO - show() : start
2008-03-14 11:20:38,608 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.ui.ooOO.G.o0OO - show() : end
2008-03-14 11:24:07,144 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.ui.ooOO.G.o0OO - onFinish() : start
2008-03-14 11:24:07,159 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.ui.ooOO.G.o0OO - onFinish() : end
2008-03-14 11:24:07,160 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.db.JobDbImpl - updateJob(JobProperties) : start
2008-03-14 11:24:07,160 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.db.drivers.file.FileJobDbStorage - update(...) : start
2008-03-14 11:24:07,161 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.db.drivers.file.FileJobDbStorage - update(...) : end
2008-03-14 11:24:07,161 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.db.JobDbImpl - updateJob(JobProperties) : end
2008-03-14 11:27:11,951 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED
2008-03-14 11:27:12,242 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 11:27:12,494 [Job #24 - 01_check_eisall] DEBUG com.softtreetech.jscheduler.business.runner.RemoteJobRunner - runJob
com.softtreetech.jscheduler.common.SchedException
at com.softtreetech.jscheduler.business.agent.remote.RemoteAgentImpl.executeJob(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:261)
at sun.rmi.transport.Transport$1.run(Transport.java:148)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:144)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:460)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:701)
at java.lang.Thread.run(Thread.java:534)
at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:247)
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:223)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
at com.softtreetech.jscheduler.business.agent.remote.RemoteAgentImpl_Stub.executeJob(Unknown Source)
at com.softtreetech.jscheduler.business.runner.RemoteJobRunner.runJob(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.do(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.Ã00000(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.execute(Unknown Source)
at com.softtreetech.jscheduler.business.runner.JobExecutorImpl.execute(Unknown Source)
at com.softtreetech.jscheduler.business.runner.JobExecutorImpl$1.run(Unknown Source)
at java.lang.Thread.run(Thread.java:534)
2008-03-14 11:27:12,497 [Job #24 - 01_check_eisall] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED
2008-03-14 11:27:12,739 [Job #24 - 01_check_eisall] ERROR com.softtreetech.jscheduler.business.runner.JobExecutorImpl - Job errors: Remote job failed. Exit code: -1
2008-03-14 11:27:12,804 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 11:27:45,476 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED
2008-03-14 11:27:45,551 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 11:27:45,742 [Job #24 - 01_check_eisall] DEBUG com.softtreetech.jscheduler.business.runner.RemoteJobRunner - runJob
com.softtreetech.jscheduler.common.SchedException
at com.softtreetech.jscheduler.business.agent.remote.RemoteAgentImpl.executeJob(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:261)
at sun.rmi.transport.Transport$1.run(Transport.java:148)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:144)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:460)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:701)
at java.lang.Thread.run(Thread.java:534)
at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:247)
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:223)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
at com.softtreetech.jscheduler.business.agent.remote.RemoteAgentImpl_Stub.executeJob(Unknown Source)
at com.softtreetech.jscheduler.business.runner.RemoteJobRunner.runJob(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.do(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.Ã00000(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.execute(Unknown Source)
at com.softtreetech.jscheduler.business.runner.JobExecutorImpl.execute(Unknown Source)
at com.softtreetech.jscheduler.business.runner.JobExecutorImpl$1.run(Unknown Source)
at java.lang.Thread.run(Thread.java:534)
2008-03-14 11:27:45,745 [Job #24 - 01_check_eisall] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED
2008-03-14 11:27:45,956 [Job #24 - 01_check_eisall] ERROR com.softtreetech.jscheduler.business.runner.JobExecutorImpl - Job errors: Remote job failed. Exit code: -1
2008-03-14 11:27:46,019 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 11:30:00,009 [Thread-63] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED
2008-03-14 11:30:00,300 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
2008-03-14 11:30:00,548 [Job #49 - 01_check_7263] DEBUG com.softtreetech.jscheduler.business.runner.RemoteJobRunner - runJob
com.softtreetech.jscheduler.common.SchedException
at com.softtreetech.jscheduler.business.agent.remote.RemoteAgentImpl.executeJob(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:261)
at sun.rmi.transport.Transport$1.run(Transport.java:148)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:144)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:460)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:701)
at java.lang.Thread.run(Thread.java:534)
at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:247)
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:223)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
at com.softtreetech.jscheduler.business.agent.remote.RemoteAgentImpl_Stub.executeJob(Unknown Source)
at com.softtreetech.jscheduler.business.runner.RemoteJobRunner.runJob(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.do(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.Ã00000(Unknown Source)
at com.softtreetech.jscheduler.business.runner.AbstractJobRunner.execute(Unknown Source)
at com.softtreetech.jscheduler.business.runner.JobExecutorImpl.execute(Unknown Source)
at com.softtreetech.jscheduler.business.runner.JobExecutorImpl$1.run(Unknown Source)
at java.lang.Thread.run(Thread.java:534)
2008-03-14 11:30:00,553 [Job #49 - 01_check_7263] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED
2008-03-14 11:30:00,784 [Job #49 - 01_check_7263] ERROR com.softtreetech.jscheduler.business.runner.JobExecutorImpl - Job errors: Remote job failed. Exit code: -1
2008-03-14 11:30:01,226 [AWT-EventQueue-0] DEBUG com.softtreetech.jscheduler.business.A.O0oO - Number of pending jobs: 14
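For a trace this long, it may help to condense it before reading. The sketch below counts ERROR entries per thread and tallies the exception class lines. It assumes the entry layout seen in the excerpt above (timestamp, thread name in brackets, level, logger, message) and is not a tool shipped with the scheduler; `ENTRY` and `summarize` are names made up for this example.

```python
import re
from collections import Counter

# Assumed entry layout, taken from the excerpt above:
# 2008-03-14 11:27:12,739 [thread] LEVEL logger - message
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>.+?)\] (?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)


def summarize(lines):
    """Return (ERROR count per thread, count per exception class line)."""
    errors = Counter()
    exceptions = Counter()
    for raw in lines:
        line = raw.strip()
        match = ENTRY.match(line)
        if match:
            if match.group("level") == "ERROR":
                errors[match.group("thread")] += 1
        elif line and not line.startswith("at "):
            # Not a log entry and not a stack frame: treat the line as the
            # exception class that heads a stack trace.
            exceptions[line] += 1
    return errors, exceptions
```

Applied to the excerpt above, this should report two ERROR entries for job #24, one for job #49, and a single exception class, `com.softtreetech.jscheduler.common.SchedException`, which suggests the same remote failure each time rather than several unrelated ones.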
|
|
Fri Mar 14, 2008 11:31 am |
|
 |
seanc217
|
|
|
|
Is there a process or file that I can lsof to see if it is being held open?
Thanks.
|
|
Fri Mar 14, 2008 11:51 am |
|
 |
SysOp
|
|
|
|
I'm not sure that this is a file-locking issue. The job could simply be waiting for a response from the agent that never arrives. Please give us some time to analyze the trace.
|
|
Fri Mar 14, 2008 12:25 pm |
|
 |
seanc217
|
|
|
|
OK thanks for the help.
|
|
Fri Mar 14, 2008 12:49 pm |
|
 |
seanc217
|
|
|
|
I think I know what part of the problem is...
It appears that I had multiple instances of the scheduler running: one in the background and one in the foreground.
I could see this messing up the queues.
Thanks,
Sean
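One way to check for duplicate instances is to scan the process list. The sketch below filters `ps` output for command lines containing a marker string. The marker `jscheduler` is an assumption taken from the package name in the trace, and the real command line on a given install may differ, so adjust it as needed. `find_instances` and `running_instances` are hypothetical helper names.

```python
import subprocess


def find_instances(ps_output, marker="jscheduler"):
    """Return (pid, command) pairs whose command line contains marker.

    Expects text in the format printed by `ps -eo pid,args`.
    """
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # The header row is skipped because its first field is not numeric.
        if len(parts) == 2 and parts[0].isdigit():
            if marker.lower() in parts[1].lower():
                matches.append((int(parts[0]), parts[1]))
    return matches


def running_instances(marker="jscheduler"):
    """Run ps on the local machine and return the matching processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return find_instances(result.stdout, marker)
```

If `running_instances()` returns more than one entry on the master, two schedulers may be working against the same queue files, which would fit the behavior described in this thread.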
|
|
Fri Mar 14, 2008 3:28 pm |
|
 |
|