SoftTree Technologies
Technical Support Forums
Restart Job Usage

 
mirrera



Joined: 28 Feb 2008
Posts: 5
Country: United States

Hello, we are relatively new 24x7 users, and I have a question about the job restart feature.

We are running v 3.4.27 on a W2k3 server. Many of the jobs run PL/SQL procedures, and some of these we would like to have automatically restart on failure.

My assumption has been that simply checking the "Restart this job if it fails" option on the 9th screen of the job wizard, and providing values for the retry interval and the number of retries, would be enough to enable this.

I have a test job that calls a procedure which will error out. In the log, I get "Attempt 1 of 3 to run this job failed, will retry in 60 seconds", yet the job never re-runs. The job was scheduled as a one-time run.

This seems pretty basic to me, and I've searched the site and haven't found any other posts with the same problem -- what am I doing wrong?

Thanks.
Thu Feb 28, 2008 2:34 pm
barefootguru



Joined: 10 Aug 2007
Posts: 195

This is on my list of things to investigate as well. I have an FTP job which sometimes fails to make a connection, so I've set the job definition for 1 retry in 60 seconds.

1. I'll always receive the 'job error' e-mail, though would rather only get this after the second fail. I don't want to be notified on the first attempt because I've told 24x7 this is OK.

2. Sometimes the second run never happens.

24x7 multi-platform 4.1 427
Thu Feb 28, 2008 2:48 pm
SysOp
Site Admin


Joined: 26 Nov 2006
Posts: 7949

We are looking into this.
Thu Feb 28, 2008 3:03 pm
mirrera



Joined: 28 Feb 2008
Posts: 5
Country: United States

Has there been any progress on this?
Tue Mar 11, 2008 10:00 am
SysOp
Site Admin


Joined: 26 Nov 2006
Posts: 7949

Here is what I found out. There is certainly a design deficiency here.

If the job is NOT set to run detached, then everything is OK. The scheduler restarts it properly as many times as specified in the retry property and sends an error email only after the last failed job run.

If the job is set to run detached, in other words as a separate process, the scheduler spawns the process and lets it go. For script-type jobs, including SQL scripts, it doesn't know whether the spawned process failed or not. The spawned process always returns 0 as the process exit code. The scheduler shares the log file with all detached jobs, and that's why you can see "failed" reported in the log.
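
To make the exit-code point concrete, here is a minimal Python sketch. It is not how 24x7 is implemented; the child script and the `launch` helper are made up for illustration. The child writes a failure message to its output but still exits with code 0, so a launcher that looks only at the exit code sees success.

```python
import subprocess
import sys
from typing import Tuple

# Hypothetical child job: the work inside fails, but the host process
# catches the error, logs it, and still exits with code 0.
CHILD_CODE = (
    "import sys\n"
    "try:\n"
    "    raise RuntimeError('procedure failed')\n"
    "except RuntimeError as exc:\n"
    "    print(f'job failed: {exc}')\n"
    "sys.exit(0)\n"
)


def launch(code: str) -> Tuple[int, str]:
    """Run `code` in a separate Python process; return (exit code, output)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    exit_code, log = launch(CHILD_CODE)
    print("exit code:", exit_code)  # the launcher's only signal
    print("log:", log.strip())      # the failure is visible only here
```

A launcher in this position would have to read the log, or the child would have to return a non-zero exit code, before any retry decision could be made.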

On the other hand, the detached job is physically separated from the scheduler and doesn't know if the scheduler is going to restart it in case of failure. That's why it always sends an error email when it fails.

As you can see, the solution to this issue is simply to uncheck the detached property. But there is a catch: jobs running non-detached share memory with the scheduler. If they leak any system resources (for example, virtually all db drivers do that), the lost resources accumulate over time and may affect the system's ability to run new jobs.

Sorry if the explanation above is too wordy.
Tue Mar 11, 2008 4:39 pm
barefootguru



Joined: 10 Aug 2007
Posts: 195

Thanks for the explanation; it's not too wordy. The answer's a bit disappointing though. I hope this is on the longer-term list to address.

(The particular job which fails here is convoluted and runs hourly, so I want it to stay detached. I'll wrap some shell scripting around it instead.)
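
For anyone wanting the same workaround, a wrapper of that kind could look roughly like the Python sketch below. It is only a sketch under assumptions: the function name `run_with_retries` and its parameters are invented here, and it only works if the wrapped command reports failure with a non-zero exit code. The job stays detached and is defined to call the wrapper, which handles the retries itself and returns a failure code only after the last attempt.

```python
import subprocess
import sys
import time
from typing import Callable, Sequence


def run_with_retries(
    command: Sequence[str],
    retries: int = 1,
    interval_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `command`; on a non-zero exit code, retry up to `retries` more times.

    Returns 0 as soon as one attempt succeeds, otherwise the exit code
    of the last failed attempt.
    """
    attempts = retries + 1
    exit_code = 1
    for attempt in range(1, attempts + 1):
        exit_code = subprocess.run(list(command)).returncode
        if exit_code == 0:
            return 0
        print(
            f"Attempt {attempt} of {attempts} failed (exit code {exit_code})",
            file=sys.stderr,
        )
        # Wait before the next attempt, but not after the final one.
        if attempt < attempts:
            sleep(interval_seconds)
    return exit_code
```

With the defaults, the command is run once and retried once after 60 seconds, which matches the "1 retry in 60 seconds" setting described earlier in the thread. The `sleep` parameter exists only so the delay can be replaced when testing.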
Tue Mar 11, 2008 11:22 pm
mirrera



Joined: 28 Feb 2008
Posts: 5
Country: United States

Yes, thanks for the reply. In my case, I can change them to not run detached, but I'm running 10 or so jobs a day, so it's not a preferred workaround. I, too, hope it's on the fix list.
Wed Mar 12, 2008 11:51 am
SysOp
Site Admin


Joined: 26 Nov 2006
Posts: 7949

We are working on these issues. Hope to see relevant changes implemented in the next maintenance version for each product.
Wed Mar 12, 2008 2:24 pm
SysOp
Site Admin


Joined: 26 Nov 2006
Posts: 7949

Here is a status update.

Both the Multi-platform and the Windows-only versions of 24x7 Scheduler have been updated to handle detached jobs and their notification actions the same way as non-detached jobs. It is now much easier to use detached jobs for job chaining, and to attach error-handling actions and retry-after-error options to them. Most of the changes were made in the Windows-only version, because the Multi-platform version has long supported all of these options except retry after error.

For more details please see
http://www.softtreetech.com/support/phpBB2/viewtopic.php?t=22268
http://www.softtreetech.com/support/phpBB2/viewtopic.php?t=22267
Wed Apr 02, 2008 11:33 am
All times are GMT - 4 Hours