Author |
Message |
klwong
Joined: 07 Mar 2009 Posts: 22 Country: Australia |
|
24x7 scheduler failed |
|
For some reason the 24x7 scheduler fails (aborts) every day. What is the best way to troubleshoot the problem?
|
|
Wed Jul 08, 2009 8:34 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7949
|
|
|
|
Does it occur at the same time every day?
Do you have a scheduled restart or something like that configured in the settings?
What does the job log say? I mean events occurring before that.
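To answer the first question, the crash times can be compared by time of day. The following is only a rough sketch, not part of 24x7: the function name `time_of_day_spread` and the sample timestamps are made up for illustration, and the real values would have to be copied by hand from the Windows Event Viewer.

```python
from datetime import datetime

MINUTES_PER_DAY = 24 * 60


def time_of_day_spread(timestamps):
    """Return the smallest window (in minutes) of the day that contains
    every timestamp's time of day, allowing for wrap-around at midnight."""
    minutes = sorted(t.hour * 60 + t.minute for t in timestamps)
    if len(minutes) < 2:
        return 0
    # Gaps between consecutive times of day, plus the gap across midnight.
    gaps = [later - earlier for earlier, later in zip(minutes, minutes[1:])]
    gaps.append(MINUTES_PER_DAY - minutes[-1] + minutes[0])
    # The window covering all crashes is the day minus the largest empty gap.
    return MINUTES_PER_DAY - max(gaps)


# Hypothetical crash times, typed in by hand for illustration only.
sample_crashes = [
    datetime(2009, 7, 6, 15, 28),
    datetime(2009, 7, 7, 15, 31),
    datetime(2009, 7, 8, 15, 35),
]

if __name__ == "__main__":
    spread = time_of_day_spread(sample_crashes)
    print(f"All crashes fall within a {spread}-minute window of the day")
```

A small window suggests something time-triggered (a scheduled job, a restart option, a backup, an antivirus scan); a window spanning most of the day suggests the crash depends on load or on specific events instead.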
|
|
Wed Jul 08, 2009 9:11 am |
|
 |
klwong
Joined: 07 Mar 2009 Posts: 22 Country: Australia |
|
|
|
Yes, it aborts at around 15:30 each day, according to the Windows Event Viewer.
I checked the configuration and it does have a restart option configured (it looks like the default) for Mon, Wed, Fri at 12:00 am, which does not match the time of the abort.
The 24x7 scheduler job log is normal, with no errors at all.
What I have done now is change the job to 'detach' to see how it goes tomorrow. Any other suggestions?
|
|
Wed Jul 08, 2009 9:33 am |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7949
|
|
|
|
It might be related to the scheduled auto-restart option. There is a known problem in the 3.5 release that causes it not to work on some systems. Please see the http://www.softtreetech.com/support/phpBB2/viewtopic.php?t=22836 message thread for more info. Note that this is fixed in 3.6, which is in the process of being released this week.
|
|
Wed Jul 08, 2009 10:21 am |
|
 |
klwong
Joined: 07 Mar 2009 Posts: 22 Country: Australia |
|
|
|
I have auto-restart disabled and have also changed all jobs to detached mode. The problem seems to happen less frequently, but it still happens every 3 days. Is there any way to troubleshoot further?
|
|
Mon Jul 13, 2009 6:49 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7949
|
|
|
|
Try updating to the 3.6 release, which is out today. Please let us know if that helps. Also, I know that you are using email-watch jobs. Please ensure that the antivirus isn't blocking some emails and that there are no other blocking email client issues. If present, such issues can cause complete suspension of 24x7 activities.
|
|
Tue Jul 14, 2009 1:17 am |
|
 |
klwong
Joined: 07 Mar 2009 Posts: 22 Country: Australia |
|
|
|
Updated to the latest version and 24x7 still crashes. I am not sure if it relates to the multiple email-watch jobs I have. I noticed that at peak time, when all the messages come in within a 30-minute period, all the jobs (3 of them) detect email, but I can see from the job log that they start in sequential order and do not overlap. I have now added a 'wait 2' to each job in case the previous job needs some time to finish for whatever reason. I have confirmed that the antivirus software is not scanning emails.
|
|
Thu Jul 16, 2009 6:19 am |
|
 |
klwong
Joined: 07 Mar 2009 Posts: 22 Country: Australia |
|
|
|
The 24x7 scheduler is still crashing after I put a pause in each email-watch job. Any more suggestions would be most appreciated.
|
|
Sun Jul 19, 2009 5:59 pm |
|
 |
SysOp
Site Admin
Joined: 26 Nov 2006 Posts: 7949
|
|
|
|
Such jobs can start a few seconds apart but cannot start at exactly the same time, as there is only one email monitor within 24x7 interfacing with your email client. You are still using MAPI, right?
There are 2 possibilities: (1) the problem manifests while scanning email, and (2) it manifests when running several competing jobs.
To eliminate the 2nd possibility, please ensure that all affected jobs are running detached. Consider detached job mode a must-have. If they are not set to detached mode, please change them as required. If they are already set to detached, we are left with the 1st possibility. I suggest starting Dr. Watson and waiting for the next crash. Review the Dr. Watson crash log file and find out which function breaks during the email check: it could be in 24x7, in MAPI, in your email client, or anywhere else. The Dr. Watson log should provide an answer.
|
|
Mon Jul 20, 2009 12:15 am |
|
 |
klwong
Joined: 07 Mar 2009 Posts: 22 Country: Australia |
|
|
|
Sorry, I did not pick up the last response, hence I did not install Dr. Watson to check. I will do it now.
For your information, the jobs are detached. I also configured the system to restart the service every day at 03:00, and so far so good, with no more crashes. Anyway, I will disable the restart and try to capture the crash via Dr. Watson; the next incident should occur within 30 hours.
|
|
Sat Jul 25, 2009 5:10 am |
|
 |
|