PrepServer Maintenance and Diagnostic

From RHESSI Wiki

Starting and Stopping

The PrepServer can be started and restarted using the re-startup.csh script (see PrepServer Installation Instructions#Standard Configuration). Executing this script starts a Management Service instance (if none is running) and then starts the IDL Pre-processing Server instances. If the PrepServer is already running, the script first has all IDL Pre-processing Server instances unregister from the Management Service and then stops them; the Management Service instance is NOT stopped. If the PrepServer must be shut down completely, execute the following commands WITH CARE:

# List all services running as the PrepServer user and show all entries for the IDL Pre-processing Service
ps -u $PREP_USER | grep PreprocessorIdlService
 
# Pick all the PIDs related to the PrepServer and execute kill
kill IDL_SERVICE_PIDs
 
# Find the Management Service
ps -u $PREP_USER | grep PreprocessorManagementService
 
# Pick the Management Service PID and execute kill
kill MANAGEMENT_SERVICE_PID

If you are SURE that the PrepServer user ($PREP_USER) runs only PrepServer-related services, everything can be stopped more quickly. HOWEVER, BE CAREFUL, because this kills every Java process owned by that user:

killall -u $PREP_USER java
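
The manual PID lookup above can be wrapped in a small helper. The following is only a sketch and not part of the PrepServer distribution: the function name pids_matching is hypothetical, and the usage line assumes a ps that supports the POSIX "-o pid=,args=" output format so that the full Java command line (including the service class name) is visible.

```shell
# Hypothetical helper (not shipped with the PrepServer): read "PID ARGS" lines
# on stdin and print the PIDs whose command line contains the given name.
# Lines whose command is awk are skipped so the helper does not list itself.
pids_matching() {
    awk -v name="$1" 'index($0, name) > 0 && $2 !~ /awk$/ { print $1 }'
}

# Usage (lists the PIDs only, nothing is killed):
# ps -u "$PREP_USER" -o pid=,args= | pids_matching PreprocessorIdlService
```

Review the printed PIDs before passing them to kill, exactly as in the manual procedure.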

Logging

The Log Files

The PrepServer components log to the following log files.

Server/Service | Default Path | Description
Front-end | $CATALINA_HOME/logs/WebServiceAccessLogger.log | Access log with details on pre-processing requests and results. The path can be changed (see PrepServer Installation Instructions#PrepServer startup parameters).
Front-end | $CATALINA_HOME/logs/catalina.out | General Tomcat log. If an error occurs in the web application, it is recorded here.
Management Server | $PREP_SERVER/logs/ManagementServiceMaintenanceLogger.log | Records the state of the management server's internal queues (registered/idle/busy servers, pre-processing requests).
Management Server | $PREP_SERVER/logs/ManagementServiceSystemLogger.log | Messages printed by the general management server process (general errors and such).
Management Server | $PREP_SERVER/logs/ManagementServiceProcessLogger.log | Messages that occur while processing client requests.
Management Server | $PREP_SERVER/logs/*_complete_mgmt.log | Written at shell level so that even fatal Java exceptions are recorded. * is the date and time the log was created.
Media Server | $PREP_SERVER/logs/MediaServiceSystemLogger.log | Messages printed by the general media server process (general errors and such).
Media Server | $PREP_SERVER/logs/MediaServiceProcessLogger.log | Messages that occur while processing client requests.
Pre-processing Server | $PREP_SERVER/logs/IdlServiceSystemLogger.*.log | Messages printed by the general pre-processing server process (general errors and such). * is the IDL Pre-processing Server ID.
Pre-processing Server | $PREP_SERVER/logs/IdlServiceProcessLogger.*.log | Messages that occur while processing client requests. * is the IDL Pre-processing Server ID.
Pre-processing Server | $PREP_SERVER/logs/**_complete_idl_*.log | Written at shell level so that even fatal Java exceptions are recorded. * is the IDL Pre-processing Server ID, ** is the date and time the log was created.
SSW IDL | $PREP_SERVER/logs/*_IDL_BATCH_RUN | Written by the SSW IDL batch script that internally executes the SSW IDL environment initialization.

The Logging Pattern

The log files contain many debug-related entries that document, for example, the status of the server (ManagementServiceMaintenanceLogger) or the pre-processing (IdlServiceProcessLogger). Interpreting the logs is essential for finding errors and bugs in the system. The general logging pattern is:
(<TIME>) <LOGLEVEL> <SOURCE> - <MESSAGE>

The PrepServer components log at debug level. The elements of the logging pattern are described below.

Element | Description | Example
Time | The time and date the output was written to the file. Not available in all files. | 2010-09-01 10:00:00
Log Level | Indicates the "level" of the log entry. | INFO, DEBUG, WARNING, ERROR, or FATAL
Source | Tells where the log message originated. Especially for exceptions, this may be very valuable for debugging purposes. | ManagementServiceSystemLogger
Message | The actual message or error message. | Initializing XYZ
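
As a quick way to see how noisy a log is, the entries can be counted per log level. The following is a sketch under the assumption that the level keyword appears within the first three whitespace-separated fields of a line (directly at the start, or after a date and a time); the function name count_levels is hypothetical.

```shell
# Hypothetical helper: count log entries per level in the given log files
# (or on stdin when no file is given). Lines without a level keyword in the
# first three fields, such as stack trace lines, are ignored.
count_levels() {
    awk '{
        for (i = 1; i <= 3 && i <= NF; i++) {
            if ($i ~ /^(INFO|DEBUG|WARNING|ERROR|FATAL)$/) { count[$i]++; break }
        }
    }
    END { for (level in count) print level, count[level] }' "$@" | sort
}

# Usage:
# count_levels "$PREP_SERVER/logs/ManagementServiceSystemLogger.log"
```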

Reading Log Files

A few tools and procedures help in analyzing the log files.

Tool | Description | Example
tail | Opens a log file and prints all new lines to the screen (mirroring). This is especially useful when starting a pre-processing request in the local client IDL session and tracking the output on the server. | tail -f ManagementServiceSystemLogger.log
cat | Prints the complete log file to the screen. Most useful with grep (see below). | cat ManagementServiceSystemLogger.log
cat + grep | cat prints the complete log, which grep consumes and filters for certain words. This allows searching for specific keywords. | cat ManagementServiceSystemLogger.log | grep "success"
nano | Nano is, like vi, a command-line editor. | nano ManagementServiceSystemLogger.log

Java Stack Traces

A Java stack trace can appear in most of the above-mentioned log files. It is written to the log whenever Java encounters an error, e.g. during pre-processing or while reading or writing a file. The error can originate in IDL or in Java itself. An error message that contains something like "com.idl.javaidl.JIDLException[iErr=-5 sMsg=MAP::WRITE: Writing file" or "com.idl.javaidl.JIDLException[iErr=-133 sMsg=Program caused arithmetic error: Floating illegal operand" originated in IDL. An error that contains "ERROR IdlServiceProcessLogger - An error occurred while pre-processing. java.lang.NullPointerException at PreprocessorIdlService.createPreprocessor" was encountered in Java. A Java stack trace always has the following structure (first an error encountered in Java, then one that originated in IDL):

ERROR IdlServiceProcessLogger  - An error occurred while pre-processing.
java.lang.NullPointerException
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.services.PreprocessorIdlService.createPreprocessor(Unknown Source)
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.services.PreprocessorIdlService.preprocess(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
 
Exception in thread "pool-1-thread-20" com.idl.javaidl.JIDLException[iErr=-133 sMsg=Program caused arithmetic error: Floating illegal operand]
	at com.idl.javaidl.JIDLPAL.nativeThrowJIDLException(Native Method)
	at com.idl.javaidl.JIDLPAL.throwSpecificException(JIDLPAL.java:1073)
	at com.idl.javaidl.JIDLPAL.throwJIDLException(JIDLPAL.java:1102)
	at com.idl.javaidl.JIDLPAL.callProcedure(JIDLPAL.java:474)
	at com.idl.javaidl.JIDLObject.callProcedure(JIDLObject.java:389)
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.bridge.PreprocessorIdlBridge.preprocess(Unknown Source)
	at gov.nasa.gsfc.jidl.vps.server.prep.idl.services.PreprocessorIdlService.preprocess(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
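
Since errors that originated in IDL are recognizable by the com.idl.javaidl.JIDLException marker, they can be pulled out of a log with sed. This is a sketch only: the function name idl_errors is hypothetical, and it assumes that the exception message sits on a single line and is closed by "]" as in the trace above.

```shell
# Hypothetical helper: print "iErr: message" for every JIDLException found
# in the given log files (or on stdin when no file is given).
# Stack frames such as "...nativeThrowJIDLException(Native Method)" do not
# match, because the pattern requires "JIDLException[iErr=".
idl_errors() {
    sed -n 's/.*JIDLException\[iErr=\(-*[0-9][0-9]*\) sMsg=\(.*\)].*/\1: \2/p' "$@"
}

# Usage:
# idl_errors "$PREP_SERVER"/logs/IdlServiceProcessLogger.*.log
```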

Detecting stuck IDL Pre-processing Servers

The ManagementServiceMaintenanceLogger.log regularly prints the number of idle, busy, and registered IDL Pre-processing Servers as well as the number of pre-processing and waiting requests.

DEBUG ManagementServiceMaintenanceLogger  - Waiting requests: 0
DEBUG ManagementServiceMaintenanceLogger  - Preprocessing/Queued requests: 0
DEBUG ManagementServiceMaintenanceLogger  - Idle servers: 5
DEBUG ManagementServiceMaintenanceLogger  - Busy servers: 0
DEBUG ManagementServiceMaintenanceLogger  - Registered servers: 5

Waiting requests while all servers are idle indicate a stuck server. In such a case the PrepServer needs to be restarted. If that does not help, the Management Service must be restarted as well.
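
The rule above (waiting requests while all servers are idle) can be checked against the most recent values in the maintenance log. The following is a sketch, not a PrepServer tool: the function name check_stuck and the output labels are hypothetical, and "all servers idle" is interpreted here as the idle count being equal to the registered count.

```shell
# Hypothetical helper: report the latest counters from the maintenance log
# (files given as arguments, or stdin) and flag a possibly stuck server.
check_stuck() {
    awk '
        /Waiting requests:/   { waiting = $NF + 0 }      # last value wins
        /Idle servers:/       { idle = $NF + 0 }
        /Registered servers:/ { registered = $NF + 0 }
        END {
            # Stuck indication: requests are waiting although every
            # registered server reports itself as idle.
            state = (waiting > 0 && registered > 0 && idle == registered) ? "POSSIBLY_STUCK" : "OK"
            printf "%s waiting=%d idle=%d registered=%d\n", state, waiting, idle, registered
        }' "$@"
}

# Usage:
# check_stuck "$PREP_SERVER/logs/ManagementServiceMaintenanceLogger.log"
```

A single snapshot can be transient, so it is worth repeating the check before restarting anything.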

Disk Space Maintenance

The PrepServer keeps a history of log files and stores temporary data used for pre-processing; those data are not deleted automatically. The amount of data collected is significant, so it must be removed periodically, preferably at a time when the PrepServer is shut down completely. The following list shows the directories where data is collected.

Directory | Description | Considerations
$PREP_SERVER/logs | The default location for the PrepServer log files. | This folder contains logs that could be used for anonymous usage statistics, which may be worth keeping or analyzing before deleting.
$PREP_SERVER/tmp | The default location for temporary files used by the IDL Pre-processing Server. | Do not delete those files while an IDL Pre-processing Server is working on a file.
$CATALINA_HOME/webapps/prepserver/public | The default location for user-uploaded level-0 data and downloadable pre-processed level-1 data. | -
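
Before deleting anything, it can help to list which files in these directories are old enough to be candidates for removal. The following is a sketch: the function name list_old_files is hypothetical, and the 30-day threshold in the usage line is an arbitrary assumption, not a PrepServer recommendation. The helper only lists files and deletes nothing.

```shell
# Hypothetical helper: list regular files below DIR ($1) whose last
# modification is more than DAYS ($2) days ago. Nothing is deleted.
list_old_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Usage:
# list_old_files "$PREP_SERVER/tmp" 30
# du -sh "$PREP_SERVER/logs" "$PREP_SERVER/tmp"   # total size per directory
```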

Extracting Anonymous Usage Data

The log file WebServiceAccessLogger.log records every user request and can be used to calculate the per-day access rate. The following command retrieves those data and writes them to a temporary file, which can be downloaded and analyzed with, e.g., Excel or Numbers (each line is a new request).

cat WebServiceAccessLogger.log | grep "New request" > $HOME/usage.stats.txt

The PrepServer logging facilities automatically archive old logs. The command below includes the archived logs as well.

cat WebServiceAccessLogger.log* | grep "New request" > $HOME/usage.stats.txt
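
The per-day access rate can also be computed directly on the server. This is a sketch that assumes each "New request" line starts with the date as its first whitespace-separated field (as in the Time example of the logging pattern); if the access log is written without a timestamp, the grouping will not be meaningful. The function name requests_per_day is hypothetical.

```shell
# Hypothetical helper: count "New request" lines per day in the given log
# files (or on stdin). Assumes the first field of each line is the date.
requests_per_day() {
    awk '/New request/ { count[$1]++ }
         END { for (day in count) print day, count[day] }' "$@" | sort
}

# Usage (includes archived logs):
# requests_per_day WebServiceAccessLogger.log*
```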