SoftTree Technologies
Technical Support Forums
ddw
Joined: 22 Oct 2014  Posts: 10  Country: United States

job output not being saved
I'm having an issue where captured output is not being saved if a job exits abnormally or with a non-zero exit code.
We're running a master server on Linux, with remote agents running the actual jobs on Solaris.
Even with successful jobs, the captured output doesn't seem to be written to a file until the job exits. Is this a configuration thing? Is there an option to turn off buffering, or something else I can try?
Version: 5.1.403
System: Linux; JVM 1.6.0_45
Thu Dec 04, 2014 5:07 pm
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
Reading output from the process and writing it to an output file is buffered on the agent side. When the job completes, the agent transfers the output file to the master scheduler. It's not the process ending that flushes the data to the output file, but the internal buffer filling up. I think the buffer size is operating-system specific, and I don't know if it can be adjusted. Still, the crashing of an external process shouldn't have any impact.
Can you please provide a bit more detail on the job in question? Specifically, what is the job doing and what happens when it crashes?
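For illustration, a similar effect can be reproduced outside the scheduler. This is a minimal sketch, not the agent's actual code path: count.sh is a hypothetical script that prints a line every few seconds, and awk stands in for the agent's buffered writer, since awk's stdout is block-buffered whenever it is not a terminal.

# Lines collect in awk's stdout buffer and reach capture.out only when
# the buffer fills (typically 4-8 KB) or awk exits cleanly.
./count.sh | awk '{ print }' > /tmp/capture.out &

# The file stays empty for a while, then fills in buffer-sized chunks.
tail -f /tmp/capture.out

# Killing the buffered stage with kill -9 loses whatever is still in
# its buffer, which is the same symptom as a crashed job.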
Thu Dec 04, 2014 5:24 pm
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
Also, if you enable tracing in the agent's settings, in case of a job crash, do you see its output written to the debug.log file (it's in the agent's home directory)?
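If it helps, the trace can be watched live while re-running the failing job. A quick sketch, where AGENT_HOME is just a placeholder for the agent's home directory:

# Follow the agent-side trace while reproducing the crash:
tail -f "$AGENT_HOME/debug.log"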
Thu Dec 04, 2014 5:26 pm
ddw
Joined: 22 Oct 2014  Posts: 10  Country: United States
That's part of the problem: I couldn't tell what it was doing. I first noticed the issue with a production job that suddenly started hanging inexplicably a few weeks ago. I went to check the output files for information and nothing was there. It occurred to me that it might be buffering, so I tried killing the job (just a kill -9 on the job's pid rather than going through the scheduler interface). I was hoping the agent would write out the captured output when the job was killed, but no such luck.
In that case the issue with the job was due to a failed patch. That has since been resolved, but I am able to reproduce the capture issue with a simple bash script:
#!/bin/bash
# Print a counter every 5 seconds, then exit with a non-zero code.
X=1
until [ $X -gt 10 ]; do
  echo "Count is $X"
  let X=X+1
  sleep 5
done
exit 1
The exit code there at the end will result in no logs being saved by the agent. Delete the "exit 1" line and the agent will write out the captured output as soon as the script completes.
Interestingly, if I instead put
echo "Error"
at the end of the script and use the REGEX check, it logs an execution error but still saves the output.
Thu Dec 04, 2014 5:36 pm
ddw
Joined: 22 Oct 2014  Posts: 10  Country: United States
SysOp wrote:
Also, if you enable tracing in the agent's settings, in case of a job crash, do you see its output written to the debug.log file (it's in the agent's home directory)?
Oh, I had turned on tracing on the master, but it hadn't occurred to me to turn it on for the agent.
Thu Dec 04, 2014 5:39 pm
ddw
Joined: 22 Oct 2014  Posts: 10  Country: United States

update
With tracing on, the exit code is logged in the debug.log file, but the actual output of the script is not saved anywhere.
I've now reproduced this on Solaris 8, Solaris 10, and RHEL 6.6. (The RHEL box is the master server in this case.)
#!/bin/bash
# Print a counter every 5 seconds, then exit with a non-zero code.
X=1
until [ $X -gt 10 ]; do
  echo "Count is $X"
  let X=X+1
  sleep 5
done
exit 1
This script will have its output saved under 24x7/Output/## if the "exit 1" line is commented out, but nothing will be saved if the exit line is present with a non-zero value.
I had planned to try upgrading my dev environment to the latest release to see if it made a difference, but most of the servers we run jobs on are still on Solaris 8, and I cannot find Java 7 for Solaris 8 anywhere.
Mon Dec 29, 2014 8:35 pm
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7948
Please try adding 1>&2 to the end of the scheduled command, for example, sh -c /home/user/scripts/whatever.sh 1>&2
This will cause the stdout output of the program to be written to the same file descriptor as stderr. Hope this helps.
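One hedged note on quoting, in case the agent hands the whole command line to a shell: placing the redirection inside the quoted -c string ensures the spawned shell applies it, rather than treating 1>&2 as a stray argument (the script path is illustrative):

# Redirection interpreted by the spawned shell itself:
sh -c '/home/user/scripts/whatever.sh 1>&2'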
Tue Dec 30, 2014 12:59 am
ddw
Joined: 22 Oct 2014  Posts: 10  Country: United States
Thanks. The 1>&2 didn't make any difference.
I've got a workaround though. Using a wrapper script like this preserves the job output and passes the exit condition back to the scheduler so it will report the error. It also gives me better control over where the output is stored, so I think this is what I'm going to end up doing.
#!/bin/ksh
# Run the given command, capture its stdout and stderr to a log file,
# record the exit code, and pass that code back to the scheduler.
LOGFILE="/tmp/foo.out"
"$@" >> "$LOGFILE" 2>&1
EC=$?
echo "$EC" >> "$LOGFILE"
exit $EC
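For reference, a sketch of how a scheduled command could call the wrapper, assuming it is saved as /home/user/scripts/logwrap.sh (the wrapper name, paths, and arguments are all illustrative):

# The wrapped command and its arguments pass through to "$@":
/home/user/scripts/logwrap.sh /home/user/scripts/whatever.sh arg1 arg2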
Tue Dec 30, 2014 7:36 pm