Author |
Message |
dno
Joined: 27 May 2009 Posts: 12 Country: United States |
|
Window BAT RunAndWait Jobs Ghosting |
|
I've got the weirdest problem ever. Our jobs have been running fine for a long time. Then one of the jobs ran later than normal and started overlapping other jobs. This was fixed. Now it seems that a scheduled or triggered job (not one of the jobs that were adjusted) runs, but only for less than a minute. The BAT executable captures the screen to a log, and that part seems to be executing. However, nothing is processed and no errors are detected. When I log on to the server and run the BAT directly, it processes just fine.
Here is part of the trace:
99: RUNANDWAIT
99: Executing RUNANDWAIT("\\MyServer\COMMANDfolder\job.BAT >\\MyServer\LOG\LOG.TXT", "\\MyServer\COMMANDfolder\", "14400", "5816")
Return "5388"
101: PROCESSGETEXITCODE
101: Executing PROCESSGETEXITCODE("0")
Return "0"
Help please. I'm currently running all the nightly jobs manually and also working days. Our version is 3.4.26 and I'm practically brain dead. Any help would be greatly appreciated.
|
|
Sun Sep 06, 2009 8:51 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7952
|
|
|
|
This might be a security- or environment-related issue. The other job may have affected the scheduler process, and with it everything the scheduler has started since then (its child processes). That would explain why the batch works when you run it manually: the process is given different environment settings.
I suggest restarting the scheduler. This should restore the original environment settings and correct the issue. Also, check whether the scheduler runs under the correct user account (valid account, password not expired, etc.).
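One way to compare the two environments, as a rough sketch rather than anything specific to 24x7: put a few diagnostic lines at the top of the batch file, run it once from the scheduler and once from an interactive logon, and compare the two outputs. The log path `%TEMP%\job_env.txt` is a placeholder chosen for illustration.

```batch
@echo off
rem Diagnostic sketch: record the context this batch file runs in.
rem The log location is an assumption; use any folder the account can write to.
set "ENVLOG=%TEMP%\job_env.txt"

echo ==== %DATE% %TIME% ==== >> "%ENVLOG%"
echo User: %USERDOMAIN%\%USERNAME% >> "%ENVLOG%"
echo Current directory: %CD% >> "%ENVLOG%"

rem List the drive mappings visible to this logon session.
net use >> "%ENVLOG%" 2>&1

rem Dump all environment variables for a side-by-side comparison.
set >> "%ENVLOG%" 2>&1
```

Note that `%TEMP%` itself may resolve to a different folder for the service account than for an interactive user, so the two runs can end up writing to different files.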
|
|
Sun Sep 06, 2009 9:35 am |
|
 |
dno
Joined: 27 May 2009 Posts: 12 Country: United States |
|
Scheduler Permissions |
|
I logged on to the server console with the admin account used for the scheduler, stopped the scheduler, and rebooted the server. After verifying that the server was running the scheduler properly, I tried to run the job. The same thing happened: the exe called from the batch job ran for a moment and then stopped. It is as if I still do not have the correct permissions. Do I have to shut the server down completely to clear a cache or some memory?
|
|
Mon Sep 07, 2009 7:38 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7952
|
|
|
|
Does the job have a timeout property set? Is it set to run asynchronously or synchronously?
|
|
Mon Sep 07, 2009 9:02 am |
|
 |
dno
Joined: 27 May 2009 Posts: 12 Country: United States |
|
|
|
The job is running JAL as follows:
Asynchronous = NO -- no other job running at this time though
Detached = YES
Queue = one of its own
No timeout property set for the job.
command RunAndWait seconds is set to 7200
Thank you for helping me.
|
|
Mon Sep 07, 2009 6:35 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7952
|
|
|
|
Please enable the tracing option (Tools/Options menu; Log tab, Trace Enabled) and restart the scheduler.
Let the job run. After the job has run, check the updated log files. If you have 24x7 version 3.6, please check the debug.log file. If you have an earlier version, please check the script.log file. Let us know what you see there.
|
|
Mon Sep 07, 2009 11:08 pm |
|
 |
dno
Joined: 27 May 2009 Posts: 12 Country: United States |
|
|
|
In the performance folder for this job, this is what it shows when trying to execute the batch file statement:
107: RUNANDWAIT
107: Executing RUNANDWAIT("\\MyServerName\COMMANDFolder\DAILYjob.BAT", "\\MyServerName\COMMANDFolder", "7200", "0")
Return "3508"
108: PROCESSGETEXITCODE
108: Executing PROCESSGETEXITCODE("0")
Return "0"
Would you happen to know what the return code 3508 means?
The jobs don't run over the weekend and on holidays. The next run will be 7/9 12:30 am HST. Thank you so much
|
|
Tue Sep 08, 2009 1:50 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7952
|
|
|
|
Well, 3508 is the Windows system process ID of the created process. It is the same value you would see in Windows Task Manager in the PID column on the details tab. This is not an error. I hope to see something more interesting in debug.log.
A couple of other options to try: change the command line to one of the following.
cmd /C \\MyServerName\COMMANDFolder\DAILYjob.BAT
or
start /I /WAIT \\MyServerName\COMMANDFolder\DAILYjob.BAT
Also toggle the tracing option in 24x7: turn it on, or if it is already on, turn it off (Tools -> Options -> Log -> Trace enabled).
Also, looking at your JAL code, you could simply convert this job from JAL to a regular program type job (batch/program/document type).
|
|
Tue Sep 08, 2009 12:35 pm |
|
 |
dno
Joined: 27 May 2009 Posts: 12 Country: United States |
|
|
|
We did a "Brain Drain" on the server to clear any cache as well as clear all IE browser temp and cookies. Problem still existed. So I stopped the 24x7 scheduler service. Opened the 24x7 master and ran the job from there. No problems. Works like a charm. I exited the master and restarted the windows service. Ran the job from a remote scheduler session and the problem re-occurred.
I did turn off the tracing option log. I exited and went back in to the master to turn the tracing log on while the Windows service was stopped. I can't seem to find a file called debug.log. However, there are files in the Performance folder under the job ID number. Are those the debug log files you are anticipating?
So the Master works but not if it is running as a service calling the BAT and running the exe. It seems to be something about that service. I even removed the service and re-established it.
Any help would be more than greatly appreciated.
|
|
Wed Sep 09, 2009 1:00 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7952
|
|
|
|
I bet this is either a security/environment settings issue, or the process you are running requires access to the desktop.
Let's start with the second theory. Here is a utility http://www.softtreetech.com/24x7/archive/43.htm that can be used to start a process from a service (or non-service) and bind it to the interactive user desktop. Download it and add it to the command line as in the example. Try running the job from the service.
If that doesn't help, check the first theory. To begin, make sure the service is running under your user account (not LocalSystem or another account confined to local-only resources). Try running the job.
If that doesn't help either, use Windows Administrative Tools -> Local Security Policy applet to enable auditing for everything (audit object access, privilege use, etc.). Restart the system to make the audit changes effective. Try running the job. Check the Windows Security Event Log for error and warning events indicating access denials.
|
|
Wed Sep 09, 2009 1:47 am |
|
 |
dno
Joined: 27 May 2009 Posts: 12 Country: United States |
|
|
|
I see what you mean... We use network drives mapped in the user's desktop logon session. I think this is what we have lost (for some reason). I'm going to try the NET USE command to create the mapping on the fly within the session. If this doesn't work, I'll use the utility to bind the job to the user desktop session. I'm assuming that if I do this the user does not need to be logged on, and that the user only has to be created on the server.
Thank you so much for helping. I'll keep you posted.
|
|
Wed Sep 09, 2009 2:39 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7952
|
|
|
|
Just in case... Services and any processes started from services cannot use existing drive mappings. They should refer to networked files using UNC names. Please also make sure the account you pick for the service has sufficient privileges for network access and can also start automatically after a system restart, before the first interactive user logon; for example, NetworkService, a domain admin account, or something like that.
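If the batch file needs a drive letter for a tool that cannot take UNC paths, one option is `pushd`, which assigns a temporary drive letter to a UNC path for the current process, with `popd` releasing it again. A minimal sketch, assuming command extensions are enabled (the default); the share `\\MyServer\DataShare` and `process.exe` are hypothetical names:

```batch
@echo off
rem pushd on a UNC path assigns a temporary drive letter and changes to it.
pushd "\\MyServer\DataShare"
if errorlevel 1 (
    echo Could not reach \\MyServer\DataShare
    exit /b 1
)

rem Run the work from the temporary drive; process.exe is a placeholder.
process.exe
set "RC=%ERRORLEVEL%"

rem Release the temporary drive letter and return the program's exit code.
popd
exit /b %RC%
```

The account running the service still needs permission on the share; `pushd` only removes the dependency on a mapping made in someone else's logon session.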
|
|
Wed Sep 09, 2009 6:55 am |
|
 |
dno
Joined: 27 May 2009 Posts: 12 Country: United States |
|
|
|
That was the problem. The batch was using the drive mapping for the service account which worked intermittently. Within the 24x7 jobs the UNC names were being used and I changed the batch to have "NET USE" to define a mapped drive. This seems to be holding up for the moment.
Thank you so much! Now I can really sleep at night (^.^)
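A rough sketch of what the NET USE approach can look like, not the actual batch file from this job: the drive letter `X:`, the share `\\MyServer\DataShare`, and `process.exe` are made-up placeholders.

```batch
@echo off
rem Map the drive inside the job's own session; a service does not
rem inherit mappings made in an interactive logon.
if exist X:\ net use X: /delete /y
net use X: "\\MyServer\DataShare" /persistent:no
if errorlevel 1 (
    echo Failed to map X: to \\MyServer\DataShare
    exit /b 1
)

rem process.exe is a placeholder for the real program.
X:\process.exe
set "RC=%ERRORLEVEL%"

rem Remove the mapping so the next run starts clean.
net use X: /delete /y
exit /b %RC%
```

Checking the result of `net use` before running the program means a failed mapping shows up as a non-zero exit code instead of a job that starts and quietly does nothing.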
|
|
Thu Sep 10, 2009 3:31 am |
|
 |
susanspy
Joined: 15 Sep 2009 Posts: 1
|
|
|
|
Hello all,
I too have a question: does the job have a timeout property set? Is it set to run asynchronous or synchronous? I guess the answer is on. Any updates on the same?
Thanks a lot.
|
|
Mon Oct 05, 2009 2:53 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7952
|
|
|
|
Hello. I'm sorry, I don't understand your question. How does it relate to the previous messages? Please describe your issue in more detail.
|
|
Mon Oct 05, 2009 5:35 pm |
|
 |
|