Thursday, September 25, 2008

workarounds when config files get corrupted in BPEL & ESB

A few days ago, while working on a high-availability environment, I ran into an issue that seemed worth sharing on this blog. When I tried to deploy new processes, the deployment failed with the error “not enough disk space”. I had not been monitoring the free space on the disk, and multiple installations and frequent backups had left it running really low. What happened next was the most unexpected part: the resource adapter XML files (oc4j-ra.xml & oc4j-connectors.xml) of orabpel, esb-rt and hw_services got corrupted. I thought I could copy them from the other node of the cluster, but to my surprise the files there were corrupted as well.

I will take you through some workarounds.

workaround #1

For this workaround to work, you need to have been taking backups of your SOA Suite installation (at a minimum, one backup taken fresh after installation).

The best option is to restore the folders from a previously taken backup. To be more specific, replace the folder NODE_HOME/ORACLE_HOME/j2ee/OC4J_INSTANCE_NAME/application-deployments/orabpel for orabpel corruption, NODE_HOME/ORACLE_HOME/j2ee/OC4J_INSTANCE_NAME/application-deployments/esb-rt for ESB corruption, and NODE_HOME/ORACLE_HOME/j2ee/OC4J_INSTANCE_NAME/application-deployments/hw_services for hw_services corruption.
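The folder swap described above can be scripted. What follows is only a rough sketch in Python, not something I ran on that environment: the function name restore_app_deployments, the ".corrupt" suffix and the two directory arguments are my own assumptions. It expects backup_root and live_root to both point at an application-deployments directory (one inside the backup, one inside the live installation), and it should be run with the server stopped.

```python
import shutil
from pathlib import Path

# Application folders named in this post.
APPS = ("orabpel", "esb-rt", "hw_services")


def restore_app_deployments(backup_root, live_root, apps=APPS):
    """Replace each live <app> folder with its copy from the backup.

    Returns the list of application names that were restored.
    """
    restored = []
    for app in apps:
        src = Path(backup_root) / app
        dst = Path(live_root) / app
        if not src.is_dir():
            # No backup for this application: leave the live folder alone.
            continue
        if dst.exists():
            # Keep the corrupted folder aside instead of deleting it.
            aside = dst.with_name(dst.name + ".corrupt")
            if aside.exists():
                shutil.rmtree(aside)
            dst.rename(aside)
        shutil.copytree(src, dst)
        restored.append(app)
    return restored
```

Keeping the corrupted folder next to the restored one is a deliberate choice: if the backup turns out to be older than you remembered, you can still compare the two copies.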

workaround #2

When you face issues such as the BPEL and ESB consoles not working, first stop the server. Before starting it again, tail the log file NODE_HOME/ORACLE_HOME/opmn/logs/groupname~oc4j_instancename~groupname~1.log, or check this log file after the server has started. The errors raised while loading these applications on startup are listed there; solve them one by one. This is how I found out that oc4j-ra.xml & oc4j-connectors.xml were corrupted. This workaround is similar to the first one, except that you only need to copy the corrupted files listed in the log file and replace the corresponding existing configuration files.
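Besides reading the log, a quick way to cross-check which copies of these files are damaged is to try parsing them. The sketch below is my own assumption of how that could look, not an Oracle tool: the function name find_corrupted_configs is made up, and "corrupted" here only means the file is empty, truncated or otherwise not well-formed XML. A file that parses cleanly can still contain wrong settings.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# The two resource adapter files mentioned in this post.
CONFIG_NAMES = ("oc4j-ra.xml", "oc4j-connectors.xml")


def find_corrupted_configs(deployments_dir, names=CONFIG_NAMES):
    """Return the paths of config files under deployments_dir that fail to parse."""
    bad = []
    for path in sorted(Path(deployments_dir).rglob("*.xml")):
        if path.name not in names:
            continue
        try:
            ET.parse(str(path))
        except (ET.ParseError, OSError):
            # Empty, truncated or unreadable file.
            bad.append(path)
    return bad
```

Pointing it at the application-deployments folder of each node gives you the list of files to replace, which you can then compare with what the log reports.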

workaround #3

The third workaround is to redeploy orabpel.ear and esb-rt.ear from the EM console. I wanted to try this option (it felt like a challenge), so I first copied these EAR files from the server to my machine as a backup. Then I deployed them one by one on both nodes of the cluster. The first deployment failed, but it wiped out the existing orabpel from the applications and application-deployments folders. Thanks to this cleanup, the second deployment went through fine. I did the same for ESB. After that I restarted the servers; both consoles were working properly and I was able to deploy processes.

workaround #4

This is the last option. If you feel your environment is too badly screwed up, go for a reinstall. Reinstall only the ESB and BPEL servers using the installer.

Before I wind up, the moral of this post: always monitor the disk space available on your server machines so that you don’t end up looking for these options :).
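If you want something to start from, here is a minimal sketch of such a check in Python. The function name free_space_report and the 10% threshold are my own assumptions; pick a threshold that suits your installation and backup sizes, and run the check from whatever scheduler you already use.

```python
import shutil


def free_space_report(path, min_free_fraction=0.10):
    """Return (free_fraction, is_low) for the filesystem that holds path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a size of zero; treat them as low.
        return 0.0, True
    free_fraction = usage.free / usage.total
    return free_fraction, free_fraction < min_free_fraction


if __name__ == "__main__":
    fraction, is_low = free_space_report(".")
    status = "LOW" if is_low else "ok"
    print(f"free space: {fraction:.1%} ({status})")
```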
