SoftTree Technologies
Technical Support Forums
ans
Joined: 17 Jul 2007 Posts: 4 Country: Switzerland |
|
No job finished message from remote agent |
|
Hi
I’m trying to execute the following job on a remote agent:
Dim processID, number
RunAndWait "xyz", "", 0, processID
The job is set up to start another job when it finishes.
The job itself runs fine, but I’m not getting any ‘remote job finished’ message back from the agent if the job takes more than about an hour to run. In that case, the dependent job is not started.
thanks
|
|
Tue Jul 17, 2007 1:57 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7955
|
|
|
|
Is it correct to say that the job does return the ‘remote job finished’ message from the agent if the 'xyz' process takes less than 30 minutes to run, and doesn't return this message if it takes longer?
Did you check the schedule.log file on the agent? Are there any additional messages related to the processing of this job?
|
|
Tue Jul 17, 2007 2:22 pm |
|
 |
ans
Joined: 17 Jul 2007 Posts: 4 Country: Switzerland |
|
|
|
... Is it correct to say that the job does return the ‘remote job finished’ message from the agent if the 'xyz' process takes less than 30 minutes to run, and doesn't return this message if it takes longer?
Yes, but I would say it's about 60 minutes rather than 30.
... Did you check the schedule.log file on the agent? Are there any additional messages related to this job processing?
All the messages on the agent seem to be correct:
remote job started
job started
job finished
remote job finished
there are no other messages.
The messages in 24x7 itself are:
job started
Agent xyz contacted
the job stays in the queue as 'running'.
|
|
Tue Jul 17, 2007 3:15 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7955
|
|
|
|
Hmm... 60 minutes could be the socket timeout. What does the scheduler show in the log if the remote job takes over 60 minutes to run? If there is nothing meaningful in the log, please enable the tracing options (Tools/Options menu, Log page) and let the job run for over 60 minutes. Then check what you get in the agent.log and script.log files. If you want to test with a dummy process that takes a long time to run, you can run the Wait utility with appropriate parameters. The utility is available here: http://www.softtreetech.com/24x7/archive/58.htm
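If the Wait utility is not at hand, a dummy long-running process can be sketched in a few lines of Python. This is only a stand-in under assumptions, not the SoftTree tool: the file name `wait_stub.py` and the functions `wait_seconds` and `main` are made up for illustration, and the real utility may take different parameters.

```python
import sys
import time
from typing import List


def wait_seconds(seconds: float) -> int:
    """Sleep for the given number of seconds, then return exit code 0."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    time.sleep(seconds)
    return 0


def main(argv: List[str]) -> int:
    # Hypothetical command line: python wait_stub.py 3601
    if len(argv) != 2:
        print("usage: wait_stub.py SECONDS", file=sys.stderr)
        return 2
    return wait_seconds(float(argv[1]))
```

To use it as a command, add `sys.exit(main(sys.argv))` at the bottom of the file and point the job at the script with a duration just under and just over 60 minutes.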
|
|
Tue Jul 17, 2007 3:27 pm |
|
 |
ans
Joined: 17 Jul 2007 Posts: 4 Country: Switzerland |
|
|
|
These are the two logs I found on the agent:
------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------
18.07.2007 10:18:52 0 202 0 A Remote job started.
18.07.2007 10:18:52 0 202 0 A Job started.
18.07.2007 11:18:53 0 202 0 A Job finished.
18.07.2007 11:18:53 0 202 0 A Remote job finished
------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------
Job No: 202 Job Name: A Process ID: 4560
Creation Time: 18-Jul-2007 8:18:52:0009 Exit Time: 18-Jul-2007 9:18:53:0009
Duration: 3601 seconds
Kernel Time: 0:00:00:0000 User Time: 0:00:00:0000
Exit Code: 0
Free Resource: Before: 100.000% After: 100.000%
Memory Load: Before: 33% After: 33%
Available Physical Memory: Before: 2147483647 (2'097'152 Kb) After: 2147483647 (2'097'152 Kb)
Available Virtual Memory: Before: 1938452480 (1'893'020 Kb) After: 1941467136 (1'895'964 Kb)
Available Page File Size: Before: 4294967295 (4'194'304 Kb) After: 4294967295 (4'194'304 Kb)
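As a quick cross-check, the elapsed time can be recomputed from the timestamps in the first log and compared with the "Duration" line in the second. The Python sketch below assumes the `dd.mm.yyyy hh:mm:ss` layout seen in the lines above; the helper names `parse_timestamp` and `elapsed_seconds` are made up for illustration.

```python
from datetime import datetime

# Lines copied from the agent-side log above.
LOG_LINES = [
    "18.07.2007 10:18:52 0 202 0 A Remote job started.",
    "18.07.2007 10:18:52 0 202 0 A Job started.",
    "18.07.2007 11:18:53 0 202 0 A Job finished.",
    "18.07.2007 11:18:53 0 202 0 A Remote job finished",
]


def parse_timestamp(line):
    """Parse the leading 'dd.mm.yyyy hh:mm:ss' fields of a log line."""
    date_part, time_part = line.split()[:2]
    return datetime.strptime(date_part + " " + time_part, "%d.%m.%Y %H:%M:%S")


def elapsed_seconds(lines):
    """Return whole seconds between the earliest and latest timestamps."""
    stamps = [parse_timestamp(line) for line in lines]
    return int((max(stamps) - min(stamps)).total_seconds())


print(elapsed_seconds(LOG_LINES))  # prints 3601
```

The result matches the 3601 seconds reported as the duration, so the agent did see the process run to completion just past the one-hour mark.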
|
|
Wed Jul 18, 2007 7:59 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7955
|
|
|
|
Sorry, I should have been more specific. We are interested in seeing "agent.log" from the scheduler side. This file is created in the 24x7 home directory if tracing is enabled in the remote agent profile properties, and it contains the internal communication protocol commands exchanged between the scheduler and the agent. "script.log" should be on the agent side. This file is created in the 24x7 home directory if the global tracing options are enabled in the agent settings (Tools/Options menu, Log page).
|
|
Wed Jul 18, 2007 8:34 am |
|
 |
ans
Joined: 17 Jul 2007 Posts: 4 Country: Switzerland |
|
|
|
Below are the logs: one run with a wait of 1 second and one with a wait of 3601 seconds.
1 Sec
Agent: script.log
JAL
**** 18.07.2007 15:44:56 ****
JAL 1: DIM
JAL 1: Executing DIM("PROCESSID", "NUMBER")
JAL 5: RUNANDWAIT
JAL 5: Executing RUNANDWAIT("C:\TEMP\WAIT.EXE 1", "", "0", "0")
JAL Return "3740"
24x7: agent.log
DPB01 Connection(05901660): ConnectToServer : application(1096), location(TKF-B3), driver(WinSock). ... (00000000)
DPB01 Connection(05901660): ConnectToServer : application(1096), location(TKF-B3), driver(WinSock). SUCCEEDED (00000000)
DPB43 SessionId (00C0F880): Create Object : className(n_gateway), instanceId(FFFFFFFF). ... (00000000)
DPB43 SessionId (00C0F880): Create Object : className(n_gateway), instanceId(00000038). SUCCEEDED (00000000)
DPB48 SessionId (00C0F880): Invoke : instanceId(00000038), uf_executejob(IO), numArgs(1) SYNCHRONOUS
DPB44 SessionId (00C0F880): Destroy Object: instanceId(00000038)
DPB02 Connection(05901660): DisconnectServer: ... (00000000)
DPB02 Connection(05901660): DisconnectServer: SUCCEEDED (00000000)
3601 Secs
Agent: script.log
empty
24x7: agent.log
DPB01 Connection(05901648): ConnectToServer : application(1096), location(TKF-B3), driver(WinSock). ... (00000000)
DPB01 Connection(05901648): ConnectToServer : application(1096), location(TKF-B3), driver(WinSock). SUCCEEDED (00000000)
DPB43 SessionId (00C0F880): Create Object : className(n_gateway), instanceId(FFFFFFFF). ... (00000000)
DPB43 SessionId (00C0F880): Create Object : className(n_gateway), instanceId(00000038). SUCCEEDED (00000000)
DPB48 SessionId (00C0F880): Invoke : instanceId(00000038), uf_executejob(IO), numArgs(1) SYNCHRONOUS
SMI02 Connection(05901818): ERROR OCCURRED: Distributed Communications Error: Recv call failed with the error EOF, (3A0) (8004200D)
SMI02 Connection(05901818): ERROR OCCURRED: The request caused an abnormal termination, the connection has been closed. (80042008)
SMI02 Connection(05901818): ERROR OCCURRED: The connection to the server has been lost (8004206E)
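The difference between the two scheduler-side traces is what follows the synchronous `uf_executejob` invoke: a clean disconnect in the short run, a run of errors in the long one. A small Python sketch for telling the two apart is below. It assumes only the line patterns visible in the traces above; the function name `invoke_outcome` and the result labels are made up for illustration.

```python
# Tail of each agent.log trace, copied from the logs above.
SHORT_RUN = [
    "DPB48 SessionId (00C0F880): Invoke : instanceId(00000038), uf_executejob(IO), numArgs(1) SYNCHRONOUS",
    "DPB44 SessionId (00C0F880): Destroy Object: instanceId(00000038)",
    "DPB02 Connection(05901660): DisconnectServer: ... (00000000)",
    "DPB02 Connection(05901660): DisconnectServer: SUCCEEDED (00000000)",
]
LONG_RUN = [
    "DPB48 SessionId (00C0F880): Invoke : instanceId(00000038), uf_executejob(IO), numArgs(1) SYNCHRONOUS",
    "SMI02 Connection(05901818): ERROR OCCURRED: Distributed Communications Error: Recv call failed with the error EOF, (3A0) (8004200D)",
    "SMI02 Connection(05901818): ERROR OCCURRED: The request caused an abnormal termination, the connection has been closed. (80042008)",
    "SMI02 Connection(05901818): ERROR OCCURRED: The connection to the server has been lost (8004206E)",
]


def invoke_outcome(trace_lines):
    """Classify what happened after the uf_executejob invoke."""
    seen_invoke = False
    for line in trace_lines:
        if "Invoke" in line and "uf_executejob" in line:
            seen_invoke = True
        elif seen_invoke and "ERROR OCCURRED" in line:
            return "connection lost"
        elif seen_invoke and "DisconnectServer" in line and "SUCCEEDED" in line:
            return "completed"
    return "unknown" if seen_invoke else "no invoke"


print(invoke_outcome(SHORT_RUN))  # prints completed
print(invoke_outcome(LONG_RUN))   # prints connection lost
```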
|
|
Wed Jul 18, 2007 11:01 am |
|