Debugging Tips

If you are having problems getting DPCL to run properly on your system, the following tips may be helpful.

DPCL fails, reporting status of ASC_failed_rhost_check

Linux DPCL implementation fails on assert in dpcl/src/dyninstAPI/src/os/linux/ibmBPatch/ibmBPatchThread.C

DPCL fails, reporting status of ASC_communication_failure

Debugging a DPCL crash

Building a private copy of DPCL to develop or debug DPCL

Problem:

DPCL fails, reporting status of ASC_failed_rhost_check.

Tips:

DPCL uses the .rhosts file in your home directory to determine if your userid is authorized to run processes on a host system. This file must contain an entry allowing your userid access to the system. This file must also have correct file permissions, read/write by user and read-only for group and other (0644). Access will be allowed only if these conditions are met.
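The conditions above can be checked before starting DPCL. The following is a minimal sketch, not part of DPCL: the function name check_rhosts is hypothetical, and it assumes the conventional .rhosts entry format of a host name optionally followed by a user name. It verifies only the 0644 mode stated above and the presence of an entry for the client host.

```shell
# Hypothetical helper (not shipped with DPCL): check a .rhosts file for the
# conditions described above. Assumes entries look like "hostname [userid]".
check_rhosts() {
    rhosts_file=$1      # usually $HOME/.rhosts
    client_host=$2      # host the DPCL client connects from
    if [ ! -f "$rhosts_file" ]; then
        echo "missing: $rhosts_file"
        return 1
    fi
    # find prints the path only when the mode is exactly 644
    if [ -z "$(find "$rhosts_file" -prune -perm 644)" ]; then
        echo "wrong permissions (expected 0644): $rhosts_file"
        return 1
    fi
    # look for a line whose first field is the client host name
    if ! awk -v h="$client_host" '$1 == h { found = 1 } END { exit !found }' "$rhosts_file"; then
        echo "no entry for host $client_host"
        return 1
    fi
    echo "ok"
}
```

For example, check_rhosts "$HOME/.rhosts" myclienthost prints ok when both conditions hold, where myclienthost is a placeholder for your own client host name.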

You may also get this failure if your .rhosts file is in a directory protected by DCE credentials where the directory is not world-readable, for instance, a home directory in a DFS filesystem. In this case, you must install the PSSP security subsystem package and configure it, using the chauthts command, to set the security model to either DCE or DCE + COMPAT. If the PSSP security subsystem is not installed and properly configured, DPCL will be unable to read the .rhosts file, and this error status will result.

Problem:

Linux DPCL implementation fails on an assert in dpcl/src/dyninstAPI/src/os/linux/ibmBPatch/ibmBPatchThread.C.
Note: You may need to follow the steps in the section Debugging a DPCL crash to identify this problem.
The source code in ibmBPatchThread.C near the assert looks like this:

ibmBPatchFunction * space = findUniqueFunction (_image, "ibmBPatchSpace");
ibmBPatchFunction * rtinit = findUniqueFunction (_image, "ibmBPatchRtinit");

// If the following assert fails, verify that the target application has
// been linked with ibmBPatchRtinit.o. This object module contains the
// relevant symbols.
assert ((space != NULL) && (rtinit != NULL));

Tips:

The most likely cause for this failure is that the target application has not been linked with the required run time support modules. DPCL is attempting to determine the addresses of two support functions and is unable to resolve those addresses.
An example makefile showing the proper files to be linked is located in the DPCL source at dpcl/src/samples/linux/hello/Makefile.hello.

Problem:

DPCL fails, reporting status of ASC_communication_failure.

Tips:

DPCL is a client/server application that runs two daemons on each host where you are running an application under DPCL control: a DPCL super daemon, and a DPCL daemon that processes DPCL client requests. DPCL establishes a socket connection between the client and these daemons. If the DPCL super daemon terminates abnormally during initial connection processing, or if the DPCL daemon terminates abnormally while the DPCL client is active, this status is reported.

The first step is to consider upgrading to the latest released level of DPCL for the version you are using, since that level may contain fixes that resolve your problem.

The next step is to look for log files in the /tmp directory. The DPCL super daemon creates a log file in /tmp and deletes it on normal termination. If the DPCL super daemon terminates abnormally, it does not delete the log file. The log file is named /tmp/dpclSD.<pid>, where <pid> is the process ID of the DPCL super daemon. This log file may contain messages that help explain the failure. Note: The DPCL super daemon log file is readable only by root.
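A quick way to find leftover super daemon logs is to list the files that match the naming pattern above. The sketch below is illustrative only: list_sd_logs is a hypothetical name, and the function takes the directory as an optional argument (defaulting to /tmp) so it can be tried on a scratch directory first.

```shell
# Hypothetical helper: list files named dpclSD.<pid> in a directory,
# where <pid> consists only of digits.
list_sd_logs() {
    log_dir=${1:-/tmp}
    for f in "$log_dir"/dpclSD.*; do
        [ -f "$f" ] || continue          # also skips an unmatched pattern
        case ${f##*/dpclSD.} in
            ''|*[!0-9]*) ;;              # suffix is not a plain number: skip
            *) echo "$f" ;;
        esac
    done
}
```

Because the log files are readable only by root, run list_sd_logs as any user to locate them, then read them as root.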

If these steps do not solve your problem, continue with the steps in the next section, Debugging a DPCL crash.

Debugging a DPCL crash

The first step is to determine whether DPCL wrote a core file at the time of the abnormal termination. For a DPCL super daemon, the core file will be in root's home directory and readable only by root. For a DPCL daemon, the core file will be written to your home directory. In both cases, the directory is on the system where the DPCL daemons were loading or creating an application process. Note: DPCL daemon core files may be large, on the order of several hundred MB, and may be truncated if you do not have sufficient free space in your home directory. Core files generated by the DPCL super daemon are much smaller.

If you have a core file, then cd to the directory where the core file resides.

For a DPCL super daemon termination, invoke the debugger, specifying the path to the DPCL super daemon. For AIX, the command is dbx /usr/lpp/ppe.dpcl/bin/dpclSD. For Linux, the command is gdb /opt/dpcl/bin/dpclSD.

For a DPCL daemon termination, invoke the debugger, specifying the path to the DPCL daemon. For AIX, the command is dbx /usr/lpp/ppe.dpcl/bin/dpcld. For Linux, the command is gdb /opt/dpcl/bin/dpcld. In the case of AIX, dbx may have problems handling shared objects loaded by DPCL, and may issue a message stating that it cannot load a file in /tmp with an apparently random filename. The workaround is to create a symbolic link (ln -s) with the filename shown in the dbx error message, pointing to some other executable on the system, then reissue the dbx command.

The debugger may issue a number of messages stating that addresses are unreadable. Ignore those error messages and wait for a dbx or gdb command prompt. Once the command prompt appears, issue a where command to display the call traceback from the core file and note the call stack. The DPCL function closest to the top of the call stack is a likely candidate for further investigation. Note that the debugger will allow you to only examine the call stack, variables passed as parameters, and global variables. Malloc'ed variables and variables local to a function will not be visible.

If examining the core file does not provide useful information, the next step is to attempt to attach the debugger to a running DPCL super daemon or DPCL daemon. In the case of a failure during initial DPCL startup, the DPCL super daemon and DPCL daemon are both possible candidates. If the failure occurs processing subsequent DPCL requests, then only the DPCL daemon is a candidate.

One way to attach a debugger to a DPCL daemon requires a small modification to the DPCL source and rebuilding DPCL. The modification is to add a call to the sleep() function at entry to main() in either the DPCL super daemon or the DPCL daemon. The main() function is located in dpcl/src/SD/src/SdServer.C for the DPCL super daemon and dpcl/src/daemon/src/main.C for the DPCL daemon. Call sleep() with a delay of at least 30 seconds to give yourself time to attach to the daemon process once the DPCL client has been started. The exact duration is unimportant, since you will resume daemon execution once the debugger is attached.

Once the code is modified, cd to the dpcl/src directory and run make. The DPCL super daemon will be built as dpcl/src/SD/src/dpclSD and the DPCL daemon will be built as dpcl/src/daemon/src/dpcld. Then copy them to the locations they will be run from. Since the code has been modified to sleep upon invocation, this will affect other DPCL users on the system. If there are other DPCL users, consider setting up an alternate DPCL execution environment as described in the section Building a Private Copy of DPCL. Otherwise, you can use the standard DPCL execution environment. In this case, on AIX systems, copy the DPCL super daemon to /usr/lpp/ppe.dpcl/bin/dpclSD and the DPCL daemon to /usr/lpp/ppe.dpcl/bin/dpcld. On Linux systems, copy the DPCL super daemon to /opt/dpcl/bin/dpclSD and the DPCL daemon to /opt/dpcl/bin/dpcld.

Once you have copied the DPCL code and set up the execution environment (if you are running a private DPCL copy), invoke your DPCL client. The client will start the DPCL daemons, at which time the sleep() function you coded will be executed and DPCL execution will be suspended. Now you need to attach to the suspended daemon process. If you are attaching to a DPCL super daemon executed in the product DPCL execution environment, note this is a root process, and you must be running as root to attach to the process. Otherwise, you can attach to the daemons using your normal login.

The first step is to determine the pid of the daemon you want to attach to. Use the ps command, for instance ps -u <userid> or ps -ef | grep dpcl, and note the relevant pids. Note that for Linux, the DPCL daemon runs multiple threads, which will appear as multiple processes due to the way Linux implements threads. In this case, choose the lowest numbered pid, which will be the main DPCL daemon thread. All pids should be consecutively numbered, or nearly so, depending on system activity.
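Picking the lowest-numbered pid can be scripted. The following is a rough sketch under stated assumptions: lowest_pid is a hypothetical helper, and it expects two-column input (pid, then command name) such as that produced by the POSIX form ps -e -o pid= -o comm=, rather than the ps -ef output mentioned above. This keeps the parsing simple and avoids matching the filter's own process.

```shell
# Hypothetical helper: read "pid command" lines on stdin and print the
# lowest pid whose command name (with any leading path removed) matches.
lowest_pid() {
    name=$1
    awk -v n="$name" '
        { cmd = $2; sub(/.*\//, "", cmd) }      # strip any leading path
        cmd == n && (min == "" || $1 + 0 < min) { min = $1 + 0 }
        END { if (min != "") print min }'
}
```

A typical call would be ps -e -o pid= -o comm= | lowest_pid dpcld. The function prints nothing if no matching process is found.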

Once you have the pid, attach to the process by issuing dbx -a <daemon_pid> on an AIX system, or gdb <daemon_path> <daemon_pid> on a Linux system, then wait for the command prompt. At the command prompt, you can set breakpoints if you like, or just issue a continue (c) command and let the daemon run to the point of abnormal termination. In the case of an ASC_communication_failure status, the most likely outcome is that the debugger gets control due to a signal such as SIGSEGV (segmentation violation) or a trap raised by a failed assert or C++ exception.

At the point the debugger regains control, you can display the call traceback, as well as any variables visible to the code at that point in execution. Use the where command to display the call traceback, or the p command to display variables.

If you cannot resolve the problem with these steps, then report the problem on the dpcl-user mailing list.

Building a Private Copy of DPCL

A private copy of DPCL may be helpful if you are debugging a DPCL problem or making changes to DPCL code. This section describes how to set up and use a private copy of DPCL. A private copy should be used only for debugging or development work, and not for regular production use.

The first step in setting up a private DPCL environment is to choose a port number and service name that your DPCL client will use to connect to DPCL daemons. You should choose a port number and service name which do not conflict with any other service described in /etc/services.
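Before committing to a name and port, it can help to scan the services file for clashes. This is a minimal sketch, not a DPCL tool: service_conflict is a hypothetical name, the parsing assumes the usual services layout (name, port/protocol, optional aliases, # comments), and the names and port numbers in the example call are placeholders.

```shell
# Hypothetical helper: report entries in a services file that already use
# the given service name (or alias) or port number.
service_conflict() {
    svc_name=$1
    svc_port=$2
    services_file=${3:-/etc/services}
    awk -v name="$svc_name" -v port="$svc_port" '
        { sub(/#.*/, "") }                  # drop comments
        NF < 2 { next }                     # skip blank or malformed lines
        {
            split($2, pp, "/")              # "7001/tcp" -> pp[1] is 7001
            hit = (pp[1] == port)
            for (i = 1; i <= NF; i++)
                if (i != 2 && $i == name) hit = 1
            if (hit) { print "conflict: " $1 " " $2; bad = 1 }
        }
        END { if (!bad) print "free" }' "$services_file"
}
```

For example, service_conflict mydpclSD 7002 prints free when neither the name nor the port appears in /etc/services, and otherwise prints one line per conflicting entry.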

The next step is to update the necessary system files. This requires root access.

For both AIX and Linux, update /etc/services to define the new service name and port. The simplest way to do this is to copy the existing entry for the dpclSD service, and change the service name and port number for the new service.

For AIX, the second step is to modify the /etc/inetd.conf file. Start by duplicating the entry for the dpclSD service which should look similar to the following:

dpclSD stream tcp nowait root /etc/dpclSD dpclSD /etc/dpcld /tmp/dpclSD01 /tmp/dpclsd

Make the following changes to the duplicated entry:

Save inetd.conf and reconfigure inetd, either using options in the SMIT panels or by issuing the refresh -s inetd command.

For Linux, you will create a new file in the /etc/xinetd.d directory, using the selected service name as its filename. The simplest way to do this is to copy /etc/xinetd.d/dpclSD to the new filename then edit the new file.

Once you have made the copy, make the following modifications to the copy:

Since xinetd configuration procedures vary slightly depending on system level, check the xinetd man page to determine how to reconfigure xinetd on your system.

Once you have made the changes to these files, copy your DPCL super daemon and DPCL daemon to their specified locations. Also copy your private DPCL libraries (libdpcl.a for AIX; libdpcl.so, libBPatch.so, libibmBPatch.so, and libdpclRT.so for Linux) to a directory of your choice.

To run a DPCL client using your private DPCL copy, set the DPCLSD environment variable to the name you selected for your DPCL service. Set your library path (LIBPATH for AIX, LD_LIBRARY_PATH for Linux) to the directory where you copied the DPCL libraries. On AIX, you may need to run the slibclean command (as root) before running your private DPCL version, in order to force libdpcl.a to be unloaded from memory. You may also want to change the permissions on libdpcl.a to 0600 (read/write by owner only) to prevent it from becoming locked into storage.
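The environment setup above can be wrapped in a small shell function. This is a sketch only: use_private_dpcl is a hypothetical name, and the service name and directory in the example call are placeholders. It prepends the private library directory to any existing library path instead of replacing it, which is a design choice of this sketch and not something DPCL requires.

```shell
# Hypothetical helper: point the current shell at a private DPCL copy.
use_private_dpcl() {
    DPCLSD=$1                     # service name chosen for the private copy
    export DPCLSD
    lib_dir=$2                    # directory holding the private DPCL libraries
    case $(uname -s) in
        AIX)
            LIBPATH=$lib_dir${LIBPATH:+:$LIBPATH}
            export LIBPATH
            ;;
        *)
            LD_LIBRARY_PATH=$lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
            export LD_LIBRARY_PATH
            ;;
    esac
}
```

For example, use_private_dpcl mydpclSD "$HOME/dpcl-private/lib" sets the variables for the current shell session only, so other DPCL users are unaffected.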

Once you have completed these steps, run your DPCL client per usual procedures.