blob: 245e1af3aa845bf4026ec3d903f91009b5eb2bf5 [file] [log] [blame]
/* **********************************************************
* Copyright (c) 2011-2024 Google, Inc. All rights reserved.
* Copyright (c) 2010 VMware, Inc. All rights reserved.
* **********************************************************/
/*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of VMware, Inc. nor the names of its contributors may be
* used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL VMWARE, INC. OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
* DAMAGE.
*/
/**
***************************************************************************
***************************************************************************
\page page_drsyms Symbol Access Library
The \p drsyms DynamoRIO Extension provides symbol information. Currently drsyms
supports reading symbol information from Windows PDB files or Linux ELF,
Mac Mach-O, or Windows PECOFF files with DWARF2 line information.
- \ref sec_drsyms_setup
- \ref sec_drsyms_paths
- \ref sec_drsyms_exports
- \ref sec_drsyms_api
- \ref sec_drsyms_mem
- \ref sec_drsyms_modbase
\section sec_drsyms_setup Setup
To use \p drsyms with your client simply include this line in your client's
\p CMakeLists.txt file:
\code use_DynamoRIO_extension(clientname drsyms) \endcode
That will automatically set up the include path and library dependence.
The \p drsyms library on Windows relies on the \p dbghelp.dll library from
Microsoft. A copy of the \p dbghelp.dll library is included.
You can also download the Debugging Tools for Windows
package from http://www.microsoft.com/whdc/devtools/debugging/default.mspx
and place the \p dbghelp.dll in the same directory as either \p drsyms.dll
or as your client library. Version 6.0 or higher of \p dbghelp.dll is
required, and 6.3 or higher is recommended. Recent versions of Windows do
have these versions of \p dbghelp.dll in their system directories, but be
aware that those versions are not redistributable.
The 64-bit version 6.5.0003.7 of \p dbghelp.dll (the one included with
Visual Studio 2005) has a bug that can cause crashes. We recommend
avoiding this version.
More recent versions of \p dbghelp.dll on Windows can use significant
amounts of stack space. We have observed usage of 36KB. This is within
the default stack size for a DynamoRIO client, but be aware if trying to
trim the stack size via the DynamoRIO runtime option <tt>-stack_size</tt>
that anything lower than 36KB is likely to be problematic on Windows.
On Windows, \p drsyms does support Cygwin and MinGW symbols, and will find
file and line information if in DWARF2 format. The \p stabs format is not
supported. Cygwin and MinGW gcc versions prior to 4.3 use \p stabs by
default; pass the \p -ggdb flag to request DWARF2.
The \p drsyms library on Linux and Mac uses bundled copies of \p libelf, \p libdwarf,
and \p libelftc built from the
<a href="http://elftoolchain.sourceforge.net">elftoolchain</a> project and
requires no setup.
\section sec_drsyms_paths Search Paths
On Linux, \p drsyms will look in the default debug directories for symbols
for a library: a <tt>.debug</tt> subdirectory of the library's directory or
in <tt>/usr/lib/debug</tt>.
On Mac, for binary <tt>foo</tt>, \p drsyms will look in the default
<tt>dsymutil</tt> path of <tt>foo.dSYM/Contents/Resources/DWARF/foo</tt>
for symbols.
On Windows, the <tt>_NT_SYMBOL_PATH</tt> environment variable is honored by
\p drsyms as a local cache of \p pdb files. However, \p drsyms does not
support symbol store paths (those that contain \p srv) when used inside of
a DynamoRIO client. Such paths should work fine when used in standalone
applications, provided that both \p symsrv.dll and \p dbghelp.dll are
locatable by the Windows loader.
\section sec_drsyms_exports Exported Functions
For clients interested only in locating specific functions exported from a
library, it is not necessary to use \p drsyms as the core DynamoRIO API
provides functions for iterating modules and looking up module exports.
The following core DynamoRIO API functions are relevant:
- dr_get_proc_address()
- dr_get_application_name()
- dr_register_module_load_event()
- dr_lookup_module()
- dr_lookup_module_by_name()
- dr_module_iterator_start()
These core API routines can be more efficient to use than the \p drsyms
routines, as the latter must locate, load, and parse debug information.
However, the core API provides no mechanism for iterating over all exported
functions: for that, \p drsyms must be used. The \p drsyms API will
operate even when no debug information is available, in which case only
exported functions will be considered.
On Linux, \p drsyms searches both \p .dynsym and \p .symtab, while
dr_get_proc_address() only searches \p .dynsym. Global symbols in an
executable (i.e., a non-library) are only present in \p .symtab by
default. If the executable is built with the \p -rdynamic flag to \p gcc,
its global symbols will be placed in \p .dynsym and thus
dr_get_proc_address() will find them. Regardless of how it was built, if
it's not stripped, \p drsyms will find the global symbols.
\section sec_drsyms_api API
All functions return a success code of type #drsym_error_t.
Prior to use, \p drsyms must be initialized by a call to drsym_init().
The \p drsyms API will eventually support both sideline and online use, and
the parameter to drsym_init() will specify the symbol server to use for
sideline use. Today only online use is supported and \p NULL should be
passed.
Symbol lookup is supported in both directions: from an address to a symbol
via drsym_lookup_address(), and from a symbol to an address via
drsym_lookup_symbol(). All symbols in a given module can be enumerated via
drsym_enumerate_symbols(), though on Windows using drsym_search_symbols()
for a particular match where a non-full search is not required (i.e., the
search is only targeting function symbols) is significantly faster and uses
less memory than a full enumeration. In fact, drsym_search_symbols() is
usually faster than drsym_lookup_symbol().
For C++ applications, each routine that handles symbols accepts a \p flags
argument that controls how or whether C++ symbols are demangled or undecorated.
Currently there are three modes:
- \p DRSYM_LEAVE_MANGLED: Matches against or returns the mangled C++ symbol.
- \p DRSYM_DEMANGLE: Matches against or returns a "short" demangled C++ symbol.
Templates and are collapsed to <> while parameters are omitted entirely
without parentheses.
- \p DRSYM_DEMANGLE_FULL: Matches against or returns the fully demangled C++
symbol with both template arguments and parameter types.
- \p DRSYM_DEMANGLE_PDB_TEMPLATES: Only applies to Windows PDB symbols.
Does not affect matches, but returns symbols with parameters omitted
without parentheses and templates fully expanded.
- \p DRSYM_DEFAULT_FLAGS: This is equivalent to \p DRSYM_DEMANGLE.
On Windows, this functionality is reduced due to the limitations of \p
dbghelp.dll. For all routines except drsym_demangle_symbol(), the only
flag that the user passes that has any effect is
DRSYM_DEMANGLE_PDB_TEMPLATES: all symbols returned or searched either use
default demangling or additionally use DRSYM_DEMANGLE_PDB_TEMPLATES if
requested.
For an example of usage see the \p instrcalls sample client distributed with
DynamoRIO.
When finished with the library, call drsym_exit().
\subsection sec_drsyms_mem Memory Usage
When running large applications, loading debug information for all of their
modules can occupy a lot of memory, from hundreds of megabytes into the
gigabyte range. Use the drsym_free_resources() routine when finished with
a module to unload its debug information. Normally, a client will query
for symbols in the module load event, and then won't need symbols again
until perhaps a callstack walk later on. Thus, we recommend that a client
call drsym_free_resources() at the end of its module load event. Due to
fragmentation concerns, it is not easy for drsyms itself to perform
internal garbage collection at any high frequency.
\subsection sec_drsyms_modbase Module Bases
All \p drsyms functions operate on relative offsets from a module base,
rather than absolute addresses for any given mapping. These are offsets
from the in-memory mapping of the module. For Mach-O executables,
the module base is considered to be placed after any __PAGEZERO segment.
This is especially important for PIE executables on MacOS, where the
randomized shift is between the __PAGEZERO segment and the __TEXT segment.
*/