SoftTree Technologies
Technical Support Forums
Scheduler died and won't start again

 
barefootguru
Joined: 10 Aug 2007
Posts: 195
Last night 24x7 died and now I can't restart it. I need some help on resurrecting it, reasonably promptly.

The debug log shows it finished the 23:00 job and then nothing:

Quote:
2007-11-14 23:00:00,012 [Timer-10032] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED
2007-11-14 23:00:02,934 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.runner.ProgramJobRunner - runJob(): start
2007-11-14 23:00:02,934 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.runner.ProgramJobRunner - execProcess(): command line [webupdate.bat hsweb] in work directory [null]
2007-11-14 23:00:02,934 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.runner.ProgramJobRunner - waitForProcess(): start
2007-11-14 23:00:02,934 [Thread-430406] DEBUG com.softtreetech.jscheduler.business.runner.AbstractJobRunner$TimeoutVerifier - run(): start
2007-11-14 23:00:02,934 [Thread-430406] DEBUG com.softtreetech.jscheduler.business.runner.AbstractJobRunner$TimeoutVerifier - run(): end due to zero timeout
2007-11-14 23:14:02,950 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.runner.ProgramJobRunner - waitForProcess(): end
2007-11-14 23:14:02,950 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.runner.AbstractJobRunner - isFailed(...) : exit code 0
2007-11-14 23:14:02,950 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.runner.ProgramJobRunner - killProcess start
2007-11-14 23:14:02,950 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.runner.ProgramJobRunner - runJob(): end
2007-11-14 23:14:05,872 [Job #35 - sitecopy hsweb] DEBUG com.softtreetech.jscheduler.business.queue.JobQueue - QUEUE_UNLOCKED


Now when I try to start 24x7 (master.bat), it gets as far as displaying the dialog saying the job log file is over 200 KB, then hangs (can't click Yes or No).

The debug log shows:

Quote:
2007-11-15 15:25:11,766 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - main(...) : start
2007-11-15 15:25:11,766 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : start
2007-11-15 15:25:12,018 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : creating business objects
2007-11-15 15:25:12,396 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : creating UI controller
2007-11-15 15:25:12,569 [main] DEBUG com.softtreetech.jscheduler.JSchedulerStarter - startup() : initializing business objects
2007-11-15 15:25:12,569 [main] DEBUG com.softtreetech.jscheduler.business.preferences.AbstractPrefDatabase - Creating backup for preferences file preferences.xml
2007-11-15 15:25:12,585 [main] DEBUG com.softtreetech.jscheduler.business.preferences.AbstractPrefDatabase - Preferences file has been copied to preferences.bak


Windows updated itself at midnight, so I'm assuming something broke during the reboot.

Latest version of 24x7 multi-platform edition, Windows XP.

How can I fix this?

Thanks
Wed Nov 14, 2007 10:45 pm

SysOp
Site Admin
Joined: 26 Nov 2006
Posts: 7838
The log file is likely corrupted if the scheduler process was killed during a log write. Please rename schedule.log and restart the scheduler.

Generally, it is a good idea to create a maintenance job that periodically renames schedule.log with a date suffix and zips it into a separate archive directory. That keeps the live log from growing too big, and you still have a copy of the old file if you need to audit past activity. Since you are running on Windows, you can schedule the command cmd /C rename schedule.log schedule-@T"yyyy-mm-dd".log to rename the file, or you can create a JavaScript job that uses the available File functions for the same task.
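
For readers who would rather script the archiving step, here is a minimal Python sketch of the "date suffix plus zip into an archive directory" idea. It is not part of 24x7. The function name archive_log and the directory layout are made up for illustration. It zips a copy and leaves the live file alone, so removing or emptying the original is a separate step.

```python
import datetime
import pathlib
import zipfile


def archive_log(log_path, archive_dir, day=None):
    """Zip a copy of log_path into archive_dir under a date-suffixed name.

    Hypothetical helper: the name and layout are illustrative only.
    """
    log_path = pathlib.Path(log_path)
    archive_dir = pathlib.Path(archive_dir)
    day = day or datetime.date.today()
    archive_dir.mkdir(parents=True, exist_ok=True)

    # schedule.log becomes schedule-YYYY-MM-DD.log inside schedule-YYYY-MM-DD.zip
    stamp = f"{day:%Y-%m-%d}"
    dated_name = f"{log_path.stem}-{stamp}{log_path.suffix}"
    zip_path = archive_dir / f"{log_path.stem}-{stamp}.zip"

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(log_path, arcname=dated_name)
    return zip_path
```

A scheduled job could call archive_log("schedule.log", "archive") once a day. Running it twice on the same day overwrites that day's zip.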
Thu Nov 15, 2007 12:11 am

barefootguru
That didn't fix it, but it did help me find the problem...

Once I was in 24x7 I realised mouse-clicks weren't recognised, but buttons would highlight if I rolled over them, and the keyboard worked fine (no other apps were affected).

I bumped Java from 1.4.2_03 to 1.4.2_16 and rebooted and all is well again. Maybe something in the Microsoft update conflicted with the older version of Java? It had been working fine for months...

Point noted about the log file--does 24x7 do any automatic archiving? There are already a couple of scheduler.lognnnnnnnnnnnnn files in the directory.

Thanks for your time.
Thu Nov 15, 2007 4:02 am

SysOp
It doesn't create them for archiving purposes. I believe these log files are generated when the scheduler is unable to write to the main file, or when a job running detached cannot write to it concurrently. That is very rare, but possible. As a best practice, it is a good idea to set up a weekly job that archives log files, or deletes them if they are not needed.
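
A weekly cleanup job along those lines might look like the Python sketch below. It is only an illustration. The file name pattern (scheduler.log followed by digits) is assumed from the file names mentioned in this thread, and the function name and the 7-day default are made up.

```python
import pathlib
import re
import shutil
import time

# Assumed pattern: "scheduler.log" followed only by digits, as seen in this thread.
OVERFLOW = re.compile(r"^scheduler\.log\d+$")


def sweep_overflow_logs(log_dir, archive_dir=None, max_age_days=7, now=None):
    """Move (or delete, if archive_dir is None) stale overflow log files.

    Returns the names of the files that were handled.
    """
    log_dir = pathlib.Path(log_dir)
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    handled = []
    for path in sorted(log_dir.iterdir()):
        if not (path.is_file() and OVERFLOW.match(path.name)):
            continue  # skips the main scheduler.log and unrelated files
        if path.stat().st_mtime > cutoff:
            continue  # modified recently, leave it alone
        if archive_dir is None:
            path.unlink()
        else:
            dest = pathlib.Path(archive_dir)
            dest.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(dest / path.name))
        handled.append(path.name)
    return handled
```

The main scheduler.log never matches the pattern, so the sweep does not touch the file the scheduler is writing to.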
Thu Nov 15, 2007 10:53 am

barefootguru
SysOp wrote:
Since you are running on Windows, you can schedule cmd /C rename schedule.log schedule-@T"yyyy-mm-dd".log

This doesn't work—looks like 24x7 has a lock on the logfile...

(I don't want to shut down 24x7 because then I can't automate the log file rotation.)
Sun Nov 18, 2007 9:32 pm

SysOp
Yes, that was a bad idea. Indeed the file is open for writing during normal operations and cannot be renamed. I can offer you a utility that can be used together with file copy to achieve the same result.

To download this utility, go to http://www.softtreetech.com/24x7/archive/59.htm
Schedule a batch file like rename_log.bat @T"yyyymmdd" and in this batch file enter something like

Code:
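rem %1 is the date suffix passed in by the scheduler job (the @T"yyyymmdd" macro)
copy scheduler.log scheduler%1.log
rem FileTruncate is the downloaded utility; it empties the live log in place
FileTruncate scheduler.log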
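
The same copy-then-truncate pattern can be sketched in Python. This is a stand-in for illustration, not the FileTruncate utility, whose internals are not described here. The function name copy_and_truncate is made up. Whether the truncate succeeds while the scheduler holds the file open depends on how the scheduler opened it, which is an assumption in this sketch.

```python
import pathlib
import shutil


def copy_and_truncate(log_path, suffix):
    """Copy log_path to a suffixed sibling, then empty the original in place.

    Hypothetical helper mirroring the two batch commands above.
    """
    log_path = pathlib.Path(log_path)
    # scheduler.log with suffix "20071119" becomes scheduler20071119.log
    copy_path = log_path.with_name(f"{log_path.stem}{suffix}{log_path.suffix}")
    shutil.copy2(log_path, copy_path)

    # Truncating keeps the same file on disk, so nothing is renamed or deleted.
    # Anything written between the copy and the truncate is lost.
    with open(log_path, "r+b") as fh:
        fh.truncate(0)
    return copy_path
```

One general caveat with this pattern: a writer that does not append will keep writing at its old offset after the truncate, so it is worth checking the first lines of the log after the first rotation.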

Mon Nov 19, 2007 12:10 am

barefootguru
Are you sure 24x7 is happy with the log file being truncated while it's running?

I've scheduled a job which copies the log and truncates it. The job finish message is written to the empty log, but after that 24x7 seems to be stuck--it looks like the next job is in the queue but isn't written to the log and it never actually runs.

(24x7 is running as a started task under Windows XP)
Fri Nov 23, 2007 4:39 pm

SysOp
I set up a test system running the latest build, #4.1.247, with 10 jobs running at various short intervals such as 1 or 2 minutes, plus a job running FileTruncate once an hour. I also set the maximum log size to 500 entries and let it run for a day. Everything seems to be fine so far. The log on the screen is limited to the last 500 entries, although I have to admit I didn't keep it open all the time and only recently checked the content. The scheduler.log file content is limited to whatever was added after the most recent FileTruncate. I will continue running this setup for a while and keep an eye on it. So far it looks like FileTruncate is not causing any problems.
Sat Nov 24, 2007 7:49 pm

barefootguru
Thanks for taking the time to test this. We're still running 4.1 242, so I could try the later version (we're moving machines in a week so I hadn't bothered upgrading the old one).

But reading through the manual it states 24x7 does delete old entries from the log:

Quote:
The default value is 3000 log records. When maximum number is reach 24x7 Scheduler automatically performs log rotation, deleting oldest records when new records are added.


So can I just rely on this?

We already have a daily audit job which records new entries in the log, so I could drop the number of records down.
Sun Nov 25, 2007 2:27 pm

SysOp
Well, that statement is only partially correct. It was written before detached jobs were introduced and has been left in the manual ever since.
If you have multiple jobs running concurrently in detached mode (which is good), you can still have more records in the log file than appear on the screen in the Log Viewer, because some of the messages are written to the log behind the scenes. The main log maintenance process attempts to monitor the log content and synchronize it with the copy cached in memory, but I am not sure the implementation guarantees exact copies on disk and in memory. There are no side effects other than that the log does not always have a fixed size and can potentially grow without limit.

Anyway, the suggestion to use log truncation and archiving was made in response to the original issue you experienced with the corrupted log file as a way to keep the log small. If you don't feel like you want to use this method, you don't have to.

If you have any suggestions for how to improve the logging processes, please let us know.
Sun Nov 25, 2007 3:52 pm

barefootguru
The original problem turned out not to be a corrupted log file, so I'm happy with the current system where 24x7 drops old entries off the log--I don't think it's worth either of us chasing the filetruncate issue. I have reduced it from (approx) 3000 entries to 2000.

Thanks again for your time.
Sun Nov 25, 2007 4:50 pm

All times are GMT - 4 Hours