PrepServer Maintenance and Diagnostic

From RHESSI Wiki

Revision as of 13:03, 28 September 2010

Starting and Stopping

The PrepServer can be started and restarted using the re-startup.csh script (see PrepServer Installation Instructions#Standard Configuration). Executing this script starts a Management Service instance (if none is running) and then starts the IDL Pre-processing Server instances. If the PrepServer is already running, the script stops all instances of the IDL Pre-processing Server, having them unregister with the Management Service before they shut down. The Management Service instance is NOT stopped. If the PrepServer must be shut down completely, execute the following commands WITH CARE:

# List all services running as the PrepServer user and show all entries for the IDL Pre-processing Service
ps -u $PREP_USER | grep PreprocessorIdlService
 
# Pick all the PIDs related to the PrepServer and execute kill
kill IDL_SERVICE_PIDs
 
# Find the Management Service
ps -u $PREP_USER | grep PreprocessorManagementService
 
# Pick the Management Service PID and execute kill
kill MANAGEMENT_SERVICE_PID

If you are SURE that the PrepServer user ($PREP_USER) runs only PrepServer-related services, everything can be stopped with a single command. BE CAREFUL: this kills every java process owned by that user.

killall -u $PREP_USER java
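
The service-by-service shutdown described above depends on picking the right PIDs by eye. As a minimal sketch, the selection step could be scripted. It assumes a ps that supports the POSIX -o option, so that the full command line is listed. pids_for_service is a hypothetical helper name, not part of the PrepServer distribution. The helper only prints PIDs and leaves the kill itself to the operator.

```shell
# Hypothetical helper: read "PID COMMAND-LINE" rows on stdin and print
# the PID of every row whose command line mentions the given service.
pids_for_service() {
    awk -v svc="$1" 'index($0, svc) > 0 { print $1 }'
}

# Usage sketch (not executed here): capture the process list first so the
# filter itself never shows up in it, then review the PIDs before killing.
#   procs=$(ps -u "$PREP_USER" -o pid= -o args=)
#   printf '%s\n' "$procs" | pids_for_service PreprocessorIdlService
```

Printing the PIDs instead of killing them directly keeps the manual review step that the procedure above asks for.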

Logging

The Log Files

The PrepServer components log to the following log files.

Server/Service | Default Path | Description
Front-end | $CATALINA_HOME/logs/WebServiceAccessLogger.log | Contains an access log with details on pre-processing requests and results. The path can be changed (see PrepServer Installation Instructions#PrepServer startup parameters).
Front-end | $CATALINA_HOME/logs/catalina.out | Contains a general Tomcat log. If an error occurs in the web application, it is recorded here.
Management Server | $PREP_SERVER/logs/ManagementServiceMaintenanceLogger.log | Records the state of the management server's internal queues (registered/idle/busy servers, preprocessing requests).
Management Server | $PREP_SERVER/logs/ManagementServiceSystemLogger.log | Logs messages printed by the general management server process (general errors and such).
Management Server | $PREP_SERVER/logs/ManagementServiceProcessLogger.log | Logs messages that occur while processing client requests.
Management Server | $PREP_SERVER/logs/*_complete_mgmt.log | A log written at shell level, ensuring that even fatal Java exceptions are recorded. * is the date and time the log was created.
Media Server | $PREP_SERVER/logs/MediaServiceSystemLogger.log | Logs messages printed by the general media server process (general errors and such).
Media Server | $PREP_SERVER/logs/MediaServiceProcessLogger.log | Logs messages that occur while processing client requests.
Pre-processing Server | $PREP_SERVER/logs/IdlServiceSystemLogger.*.log | Logs messages printed by the general pre-processing server process (general errors and such). * is the IDL Pre-processing Server ID.
Pre-processing Server | $PREP_SERVER/logs/IdlServiceProcessLogger.*.log | Logs messages that occur while processing client requests. * is the IDL Pre-processing Server ID.
Pre-processing Server | $PREP_SERVER/logs/**_complete_idl_*.log | A log written at shell level, ensuring that even fatal Java exceptions are recorded. * is the IDL Pre-processing Server ID, ** is the date and time the log was created.
SSW IDL | $PREP_SERVER/logs/*_IDL_BATCH_RUN | A log file written by the SSW IDL batch script that internally executes the SSW IDL environment initialization.

The Logging Pattern

The log files contain many debug-related entries, which document the status of the server (ManagementServiceMaintenanceLogger), the pre-processing (IdlServiceProcessLogger), etc. Interpreting the logs is essential for finding errors and bugs in the system. The general logging pattern is:
<LOGLEVEL> <SOURCE> <MESSAGE>

The PrepServer components log at debug level. Each log entry consists of the following elements.

Element | Description | Example
Log Level | Indicates the "level" of the log entry. | INFO, DEBUG, WARNING, ERROR, or FATAL
Source | Tells where the log message originated. Especially for exceptions, this can be very valuable for debugging. | ManagementServiceSystemLogger
Message | The actual message or error message. | Initializing XYZ
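
To make the pattern concrete, the following sketch splits log entries into their three elements. It assumes that entries start directly with the log level and that source and message are separated by " - ", as in the sample entries elsewhere on this page. split_entry is a hypothetical helper name. Lines that do not start with a known level, such as stack trace lines, are skipped.

```shell
# Hypothetical helper: print "LEVEL|SOURCE|MESSAGE" for every log entry
# read from stdin or from the files given as arguments.
split_entry() {
    awk '$1 ~ /^(INFO|DEBUG|WARNING|ERROR|FATAL)$/ {
        level = $1
        source = $2
        msg = $0
        # Drop the level, the source, and the optional "- " separator.
        sub(/^[^ ]+ +[^ ]+ +(- )?/, "", msg)
        print level "|" source "|" msg
    }' "$@"
}

# Usage sketch (not executed here):
#   split_entry "$PREP_SERVER/logs/ManagementServiceSystemLogger.log"
```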

Reading Log Files

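The log files can be inspected with standard command-line tools: tail shows the end of a file (or follows it while it grows), cat prints a whole file, cat piped into grep filters for specific entries, and nano opens a file for browsing.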
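
As a sketch of how these tools are typically combined, the two helpers below wrap tail and grep. The names log_tail and log_errors are hypothetical, and the filter assumes that entries start with the log level, as described in The Logging Pattern.

```shell
# Print the last N lines of a log file (default: 50).
log_tail() {
    tail -n "${2:-50}" "$1"
}

# Print ERROR and FATAL entries together with their line numbers.
log_errors() {
    grep -n -E '^(ERROR|FATAL) ' "$1"
}

# Usage sketch (not executed here):
#   log_tail   "$PREP_SERVER/logs/ManagementServiceSystemLogger.log" 100
#   log_errors "$PREP_SERVER/logs/ManagementServiceProcessLogger.log"
#   tail -f    "$PREP_SERVER/logs/ManagementServiceMaintenanceLogger.log"   # follow live
```

The line numbers printed by log_errors make it easy to jump to the surrounding context in nano afterwards.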

Java Stack Traces

A Java Stack Trace can occur in most of the above-mentioned log files. It is written to the log whenever Java encounters an error, e.g. in pre-processing or while reading or writing a file. The source of such an error can be IDL or Java itself. An error message that contains something like "com.idl.javaidl.JIDLException[iErr=-5 sMsg=MAP::WRITE: Writing file" or "com.idl.javaidl.JIDLException[iErr=-133 sMsg=Program caused arithmetic error: Floating illegal operand" originated in IDL. An error that contains "ERROR IdlServiceProcessLogger - An error occurred while pre-processing. java.lang.NullPointerException at PreprocessorIdlService.createPreprocessor" was encountered in Java. A Java Stack Trace always has the following structure, shown first for an error in Java and then for an error that originated in IDL:

ERROR IdlServiceProcessLogger  - An error occurred while pre-processing.
java.lang.NullPointerException
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.services.PreprocessorIdlService.createPreprocessor(Unknown Source)
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.services.PreprocessorIdlService.preprocess(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
 
Exception in thread "pool-1-thread-20" com.idl.javaidl.JIDLException[iErr=-133 sMsg=Program caused arithmetic error: Floating illegal operand]
	at com.idl.javaidl.JIDLPAL.nativeThrowJIDLException(Native Method)
	at com.idl.javaidl.JIDLPAL.throwSpecificException(JIDLPAL.java:1073)
	at com.idl.javaidl.JIDLPAL.throwJIDLException(JIDLPAL.java:1102)
	at com.idl.javaidl.JIDLPAL.callProcedure(JIDLPAL.java:474)
	at com.idl.javaidl.JIDLObject.callProcedure(JIDLObject.java:389)
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.bridge.PreprocessorIdlBridge.preprocess(Unknown Source)
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.services.PreprocessorIdlService.preprocess(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
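
The distinction between IDL and Java errors can be checked quickly from the command line. The following is a minimal sketch with hypothetical helper names. It assumes that IDL errors always carry the JIDLException[iErr=...] marker shown above and that Java exceptions appear at the start of a line, as in the first example.

```shell
# Print the distinct IDL error codes (the iErr values) found in the input.
idl_error_codes() {
    sed -n 's/.*JIDLException\[iErr=\(-\{0,1\}[0-9][0-9]*\).*/\1/p' "$@" | sort -u
}

# Print lines that name a Java exception or error class, i.e. errors
# that were most likely raised in Java itself.
java_exceptions() {
    grep -E '^java\.[A-Za-z.]+(Exception|Error)' "$@"
}

# Usage sketch (not executed here):
#   idl_error_codes "$PREP_SERVER"/logs/IdlServiceProcessLogger.*.log
```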

Detecting stuck IDL Pre-processing Servers

The ManagementServiceMaintenanceLogger.log regularly prints the number of idle, busy, and registered IDL Pre-processing Servers as well as the number of preprocessing and waiting requests.

DEBUG ManagementServiceMaintenanceLogger  - Waiting requests: 0
DEBUG ManagementServiceMaintenanceLogger  - Preprocessing/Queued requests: 0
DEBUG ManagementServiceMaintenanceLogger  - Idle servers: 5
DEBUG ManagementServiceMaintenanceLogger  - Busy servers: 0
DEBUG ManagementServiceMaintenanceLogger  - Registered servers: 5

Waiting requests while all servers are idle indicate a stuck server. In that case the PrepServer needs to be restarted (see Starting and Stopping). If that does not help, the Management Service must be restarted as well.
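
This check can be sketched as a small script that reads the maintenance log and evaluates the most recent counters. check_stuck is a hypothetical helper name, and the sketch assumes the entry wording shown in the sample above. A single snapshot may also catch a short transient state, so a "STUCK" result is best confirmed by running the check again a little later.

```shell
# Report "STUCK" when the latest counters show waiting requests while every
# registered server is idle, otherwise "OK".
check_stuck() {
    awk '
        /Waiting requests:/   { waiting = $NF + 0 }
        /Idle servers:/       { idle = $NF + 0 }
        /Registered servers:/ { registered = $NF + 0 }
        END {
            if (waiting > 0 && registered > 0 && idle == registered)
                print "STUCK"
            else
                print "OK"
        }
    ' "$@"
}

# Usage sketch (not executed here):
#   check_stuck "$PREP_SERVER/logs/ManagementServiceMaintenanceLogger.log"
```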
