| Copyright 1994, 1995, 1996, 1999, 2000, 2001, 2002 Free Software |
| Foundation, Inc. |
| |
| This file is free documentation; the Free Software Foundation gives |
| unlimited permission to copy, distribute and modify it. |
| |
| |
| Perftools-Specific Install Notes |
| ================================ |
| |
| *** Building from source repository |
| |
| As of 2.1 gperftools does not have configure and other autotools |
| products checked into its source repository. This is common practice |
| for projects using autotools. |
| |
| NOTE: Source releases (.tar.gz that you download from |
| code.google.com/p/gperftools) still have all required files just as |
| before. Nothing has changed w.r.t. building from .tar.gz releases. |
| |
| But in order to build gperftools checked out from the subversion |
| repository, you need to have autoconf, automake and libtool |
| installed. Before running ./configure you have to generate it (and |
| a bunch of other files) by running the ./autogen.sh script. That |
| script takes care of calling the correct autotools programs in the |
| correct order. |
| |
| If you're a maintainer then it's business as usual too. Just run make |
| dist (or, preferably, make distcheck) and it'll produce a .tar.gz or |
| .tar.bz2 with all autotools magic already included, so that users can |
| build our software without having autotools. |
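| |
| As a rough sketch of the repository-checkout steps above, the script |
| below checks for the three tools before calling ./autogen.sh. The |
| missing_tools helper is hypothetical, and detecting libtool through |
| its libtoolize command is an assumption that may not hold everywhere. |
| |
```shell
#!/bin/sh
# Print the names of any commands from the argument list that are not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: libtool is detected through its libtoolize helper.
missing=$(missing_tools autoconf automake libtoolize)

if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "install these before running ./autogen.sh:" $missing
elif [ -x ./autogen.sh ]; then
    # Only meaningful inside a gperftools repository checkout.
    ./autogen.sh && ./configure && make
else
    echo "autotools found, but ./autogen.sh is not in this directory"
fi
```
| |
| Run it from the top of a checkout; anywhere else it only reports |
| what it found. |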
| |
| |
| *** NOTE FOR 64-BIT LINUX SYSTEMS |
| |
| The glibc built-in stack-unwinder on 64-bit systems has some problems |
| with the perftools libraries. (In particular, the cpu/heap profilers |
| may be in the middle of malloc, holding some malloc-related locks, when |
| they invoke the stack unwinder. The built-in stack unwinder may call |
| malloc recursively, which may require the thread to acquire a lock it |
| already holds: deadlock.) |
| |
| For that reason, if you use a 64-bit system, we strongly recommend you |
| install libunwind before trying to configure or install gperftools. |
| libunwind can be found at |
| |
| http://download.savannah.gnu.org/releases/libunwind/libunwind-0.99-beta.tar.gz |
| |
| Even if you already have libunwind installed, you should check the |
| version. Versions older than this will not work properly; too-new |
| versions introduce new code that does not work well with perftools |
| (because libunwind can call malloc, which will lead to deadlock). |
| |
| There have been reports of crashes with libunwind 0.99 (see |
| http://code.google.com/p/gperftools/issues/detail?id=374). |
| Alternately, you can use a more recent libunwind (e.g. 1.0.1) at the |
| cost of adding a bit of boilerplate to your code. For details, see |
| http://groups.google.com/group/google-perftools/msg/2686d9f24ac4365f |
| |
| CAUTION: if you install libunwind from the url above, be aware that |
| you may have trouble if you try to statically link your binary with |
| perftools: that is, if you link with 'gcc -static -lgcc_eh ...'. |
| This is because both libunwind and libgcc implement the same C++ |
| exception handling APIs, but they implement them differently on |
| some platforms. This is not likely to be a problem on ia64, but |
| may be on x86-64. |
| |
| Also, if you link binaries statically, make sure that you add |
| -Wl,--eh-frame-hdr to your linker options. This is required so that |
| libunwind can find the information generated by the compiler |
| required for stack unwinding. |
| |
| Using -static is rare, though, so unless you know this will affect |
| you it probably won't. |
| |
| If you cannot or do not wish to install libunwind, you can still try |
| to use the built-in stack unwinder. The built-in stack unwinder |
| requires that your application, the tcmalloc library, and system |
| libraries like libc, all be compiled with a frame pointer. This is |
| *not* the default for x86-64. |
| |
| If you are on an x86-64 system, know that you have a set of system |
| libraries with frame-pointers enabled, and compile all your |
| applications with -fno-omit-frame-pointer, then you can enable the |
| built-in perftools stack unwinder by passing the |
| --enable-frame-pointers flag to configure. |
| |
| Even with the use of libunwind, there are still known problems with |
| stack unwinding on 64-bit systems, particularly x86-64. See the |
| "64-BIT ISSUES" section in README. |
| |
| If you encounter problems, try compiling perftools with './configure |
| --enable-frame-pointers'. Note you will need to compile your |
| application with frame pointers (via 'gcc -fno-omit-frame-pointer |
| ...') in this case. |
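| |
| The choice described in this section can be sketched as a small |
| script. This is only an illustration: suggest_configure is a |
| hypothetical helper, matching "64" in the machine name is a crude |
| stand-in for "64-bit system", and the include directories searched for |
| libunwind.h are assumptions. |
| |
```shell
#!/bin/sh
# Print a configure command for the given machine type. The second argument
# says whether libunwind is installed ("yes" or "no").
suggest_configure() {
    case "$1" in
        *64*)
            if [ "$2" = yes ]; then
                echo "./configure"
            else
                # Only safe when the application and system libraries are
                # built with -fno-omit-frame-pointer.
                echo "./configure --enable-frame-pointers"
            fi
            ;;
        *)
            echo "./configure"
            ;;
    esac
}

# Assumption: libunwind.h lives in one of these common include directories.
have_libunwind=no
for dir in /usr/include /usr/local/include; do
    if [ -f "$dir/libunwind.h" ]; then
        have_libunwind=yes
    fi
done

suggest_configure "$(uname -m)" "$have_libunwind"
```
| |
| The script only prints a suggestion; installing libunwind remains the |
| recommended route on 64-bit systems. |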
| |
| |
| *** TCMALLOC LARGE PAGES: TRADING TIME FOR SPACE |
| |
| You can set a compiler directive that makes tcmalloc faster, at the |
| cost of using more space (due to internal fragmentation). |
| |
| Internally, tcmalloc divides its memory into "pages." The default |
| page size is chosen to minimize memory use by reducing fragmentation. |
| The cost is that keeping track of these pages can cost tcmalloc time. |
| We've added a new flag to tcmalloc that enables a larger page size. |
| In general, this will increase the memory needs of applications using |
| tcmalloc. However, in many cases it will speed up the applications |
| as well, particularly if they allocate and free a lot of memory. We've |
| seen average speedups of 3-5% on Google applications. |
| |
| To build libtcmalloc with large pages you need to use the |
| --with-tcmalloc-pagesize=ARG configure flag, e.g.: |
| |
| ./configure <other flags> --with-tcmalloc-pagesize=32 |
| |
| The ARG argument can be 4, 8, 16, 32, 64, 128 or 256 which sets the |
| internal page size to 4K, 8K, 16K, 32K, 64K, 128K and 256K respectively. |
| The default is 8K. |
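| |
| A minimal sketch of validating the page size before handing it to |
| configure. The pagesize_flag helper is hypothetical; only the flag |
| name and the accepted values come from the text above. |
| |
```shell
#!/bin/sh
# Print the configure flag for a tcmalloc page size given in kilobytes,
# or fail if the value is not one the flag accepts.
pagesize_flag() {
    case "$1" in
        4|8|16|32|64|128|256)
            echo "--with-tcmalloc-pagesize=$1"
            ;;
        *)
            echo "unsupported page size: $1" >&2
            return 1
            ;;
    esac
}

# Example: request 32K pages. The command is printed, not executed.
flag=$(pagesize_flag 32) && echo "./configure $flag"
```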
| |
| |
| *** SMALL TCMALLOC CACHES: TRADING SPACE FOR TIME |
| |
| You can set a compiler directive that makes tcmalloc use less memory |
| for overhead, at the cost of some time. |
| |
| Internally, tcmalloc keeps information about some of its internal data |
| structures in a cache. This speeds memory operations that need to |
| access this internal data. We've added a new, experimental flag to |
| tcmalloc that reduces the size of this cache, decreasing the memory |
| needs of applications using tcmalloc. |
| |
| This feature is still very experimental; it's not even a configure |
| flag yet. To build libtcmalloc with smaller internal caches, run |
| |
| ./configure <normal flags> CXXFLAGS=-DTCMALLOC_SMALL_BUT_SLOW |
| |
| (or add -DTCMALLOC_SMALL_BUT_SLOW to your existing CXXFLAGS argument). |
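| |
| When CXXFLAGS already holds other options, the define has to be |
| appended and the whole value quoted. A sketch, assuming a |
| hypothetical with_small_caches helper and example flags "-O2 -g": |
| |
```shell
#!/bin/sh
# Append -DTCMALLOC_SMALL_BUT_SLOW to a flag string unless it is already there.
with_small_caches() {
    case " $1 " in
        *" -DTCMALLOC_SMALL_BUT_SLOW "*) echo "$1" ;;
        *) echo "${1:+$1 }-DTCMALLOC_SMALL_BUT_SLOW" ;;
    esac
}

# Example with existing flags; the command is printed, not executed.
echo "./configure CXXFLAGS=\"$(with_small_caches "-O2 -g")\""
```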
| |
| |
| *** NOTE FOR ___tls_get_addr ERROR |
| |
| When compiling perftools on some old systems, like RedHat 8, you may |
| get an error like this: |
| ___tls_get_addr: symbol not found |
| |
| This means that you have a system where some parts are updated enough |
| to support Thread Local Storage, but others are not. The perftools |
| configure script can't always detect this kind of case, leading to |
| that error. To fix it, just comment out the line |
| #define HAVE_TLS 1 |
| in your config.h file before building. |
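| |
| The edit can also be scripted. The disable_tls function below is a |
| hypothetical helper, not part of perftools; it wraps the define in a C |
| comment and leaves every other line of the file alone. |
| |
```shell
#!/bin/sh
# Comment out "#define HAVE_TLS 1" in the config header given as $1.
# A temporary file is used because in-place sed options are not portable.
disable_tls() {
    config=$1
    tmp="$config.tmp.$$"
    sed 's|^#define HAVE_TLS 1[[:space:]]*$|/* #define HAVE_TLS 1 */|' \
        "$config" > "$tmp" &&
        mv "$tmp" "$config"
}
```
| |
| Call it with the path to the generated config.h after running |
| configure and before running make. |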
| |
| |
| *** TCMALLOC AND DLOPEN |
| |
| To improve performance, we use the "initial exec" model of Thread |
| Local Storage in tcmalloc. The price for this is the library will not |
| work correctly if it is loaded via dlopen(). This should not be a |
| problem, since loading a malloc-replacement library via dlopen is |
| asking for trouble in any case: some data will be allocated with one |
| malloc, some with another. If, for some reason, you *do* need to use |
| dlopen on tcmalloc, the easiest way is to use a version of tcmalloc |
| with TLS turned off; see the ___tls_get_addr note above. |
| |
| |
| *** COMPILING ON NON-LINUX SYSTEMS |
| |
| Perftools has been tested on the following systems: |
| FreeBSD 6.0 (x86) |
| FreeBSD 8.1 (x86_64) |
| Linux CentOS 5.5 (x86_64) |
| Linux Debian 4.0 (PPC) |
| Linux Debian 5.0 (x86) |
| Linux Fedora Core 3 (x86) |
| Linux Fedora Core 4 (x86) |
| Linux Fedora Core 5 (x86) |
| Linux Fedora Core 6 (x86) |
| Linux Fedora Core 13 (x86_64) |
| Linux Fedora Core 14 (x86_64) |
| Linux RedHat 9 (x86) |
| Linux Slackware 13 (x86_64) |
| Linux Ubuntu 6.06.1 (x86) |
| Linux Ubuntu 6.06.1 (x86_64) |
| Linux Ubuntu 10.04 (x86) |
| Linux Ubuntu 10.10 (x86_64) |
| Mac OS X 10.3.9 (Panther) (PowerPC) |
| Mac OS X 10.4.8 (Tiger) (PowerPC) |
| Mac OS X 10.4.8 (Tiger) (x86) |
| Mac OS X 10.5 (Leopard) (x86) |
| Mac OS X 10.6 (Snow Leopard) (x86) |
| Solaris 10 (x86_64) |
| Windows XP, Visual Studio 2003 (VC++ 7.1) (x86) |
| Windows XP, Visual Studio 2005 (VC++ 8) (x86) |
| Windows XP, Visual Studio 2008 (VC++ 9) (x86) |
| Windows XP, Visual Studio 2010 (VC++ 10) (x86) |
| Windows XP, MinGW 5.1.3 (x86) |
| Windows XP, Cygwin 5.1 (x86) |
| |
| It works in its full generality on the Linux systems |
| tested (though see 64-bit notes above). Portions of perftools work on |
| the other systems. The basic memory-allocation library, |
| tcmalloc_minimal, works on all systems. The cpu-profiler also works |
| fairly widely. However, the heap-profiler and heap-checker are not |
| yet as widely supported. In general, the 'configure' script will |
| detect what OS you are building for, and only build the components |
| that work on that OS. |
| |
| Note that tcmalloc_minimal is perfectly usable as a malloc/new |
| replacement, so it is possible to use tcmalloc on all the systems |
| above, by linking in libtcmalloc_minimal. |
| |
| ** FreeBSD: |
| |
| The following binaries build and run successfully (creating |
| libtcmalloc_minimal.so and libprofiler.so in the process): |
| % ./configure |
| % make tcmalloc_minimal_unittest tcmalloc_minimal_large_unittest \ |
| addressmap_unittest atomicops_unittest frag_unittest \ |
| low_level_alloc_unittest markidle_unittest memalign_unittest \ |
| packed_cache_test stacktrace_unittest system_alloc_unittest \ |
| thread_dealloc_unittest profiler_unittest.sh |
| % ./tcmalloc_minimal_unittest # to run this test |
| % [etc] # to run other tests |
| |
| Four caveats: first, frag_unittest tries to allocate 400M of memory, |
| and if you have less virtual memory on your system, the test may |
| fail with a bad_alloc exception. |
| |
| Second, profiler_unittest.sh sometimes fails in the "fork" test. |
| This is because stray SIGPROF signals from the parent process are |
| making their way into the child process. (This may be a kernel |
| bug that only exists in older kernels.) The profiling code itself |
| is working fine. This only affects programs that call fork(); for |
| most programs, the cpu profiler is entirely safe to use. |
| |
| Third, perftools depends on /proc to get shared library |
| information. If you are running a FreeBSD system without proc, |
| perftools will not be able to map addresses to functions. Some |
| unittests will fail as a result. |
| |
| Finally, the new test introduced in perftools-1.2, |
| profile_handler_unittest, fails on FreeBSD. It has something to do |
| with how the itimer works. The cpu profiler test passes, so I |
| believe the functionality is correct and the issue is with the test |
| somehow. If anybody is an expert on itimers and SIGPROF in |
| FreeBSD, and would like to debug this, I'd be glad to hear the |
| results! |
| |
| libtcmalloc.so successfully builds, and the "advanced" tcmalloc |
| functionality all works except for the leak-checker, which has |
| Linux-specific code: |
| % make heap-profiler_unittest.sh maybe_threads_unittest.sh \ |
| tcmalloc_unittest tcmalloc_both_unittest \ |
| tcmalloc_large_unittest # THESE WORK |
| % make -k heap-checker_unittest.sh \ |
| heap-checker-death_unittest.sh # THESE DO NOT |
| |
| Note that unless you specify --enable-heap-checker explicitly, |
| 'make' will not build the heap-checker unittests on a FreeBSD |
| system. |
| |
| I have not tested other *BSD systems, but they are probably similar. |
| |
| ** Mac OS X: |
| |
| I've tested OS X 10.5 [Leopard], OS X 10.4 [Tiger] and OS X 10.3 |
| [Panther] on both Intel (x86) and PowerPC systems. For Panther |
| systems, perftools does not work at all: it depends on a header |
| file, OSAtomic.h, which is new in 10.4. (It's possible to get the |
| code working for Panther/i386 without too much work; if you're |
| interested in exploring this, drop an e-mail.) |
| |
| For the other seven systems, the binaries and libraries that |
| successfully build are exactly the same as for FreeBSD. See that |
| section for a list of binaries and instructions on building them. |
| |
| In addition, it appears OS X regularly fails profiler_unittest.sh |
| in the "thread" test (in addition to occasionally failing in the |
| "fork" test). It looks like OS X often delivers the profiling |
| signal to the main thread, even when it's sleeping, rather than |
| spawned threads that are doing actual work. If anyone knows |
| details of how OS X handles SIGPROF (via setitimer()) events with |
| threads, and has insight into this problem, please send mail to |
| google-perftools@googlegroups.com. |
| |
| ** Solaris 10 x86: |
| |
| I've only tested using the GNU C++ compiler, not the Sun C++ |
| compiler. Using g++ requires setting the PATH appropriately when |
| configuring. |
| |
| % PATH=${PATH}:/usr/sfw/bin/:/usr/ccs/bin ./configure |
| % PATH=${PATH}:/usr/sfw/bin/:/usr/ccs/bin make [...] |
| |
| Again, the binaries and libraries that successfully build are |
| exactly the same as for FreeBSD. (However, while libprofiler.so can |
| be used to generate profiles, pprof is not very successful at |
| reading them -- necessary helper programs like nm don't seem |
| to be installed by default on Solaris, or perhaps are only |
| installed as part of the Sun C++ compiler package.) See that |
| section for a list of binaries, and instructions on building them. |
| |
| ** Windows (MSVC, Cygwin, and MinGW): |
| |
| Work on Windows is rather preliminary: only tcmalloc_minimal is |
| supported. |
| |
| We haven't found a good way to get stack traces in release mode on |
| Windows (that is, when FPO is enabled), so the heap profiling may |
| not be reliable in that case. Also, heap-checking and CPU profiling |
| do not yet work at all. But as in other ports, the basic tcmalloc |
| library functionality, overriding malloc and new and such (and even |
| Windows-specific functions like _aligned_malloc!), is working fine, |
| at least with VC++ 7.1 (Visual Studio 2003) through VC++ 10.0, |
| in both debug and release modes. See README.windows for |
| instructions on how to install on Windows using Visual Studio. |
| |
| Cygwin can compile some but not all of perftools. Furthermore, |
| there is a problem with exception-unwinding in cygwin (it can call |
| malloc, which can call the exception-unwinding-setup code, which |
| can lead to an infinite loop). I've committed a workaround to the |
| exception unwinding problem, but it only works in debug mode and |
| when statically linking in tcmalloc. I hope to have a more proper |
| fix in a later release. To configure under cygwin, run |
| |
| ./configure --disable-shared CXXFLAGS=-g && make |
| |
| Most of cygwin will compile (cygwin doesn't allow weak symbols, so |
| the heap-checker and a few other pieces of functionality will not |
| compile). 'make' will compile those libraries and tests that can |
| be compiled. You can run 'make check' to make sure the basic |
| functionality is working. I've heard reports that some versions of |
| cygwin fail calls to pthread_join() with EINVAL, causing several |
| tests to fail. If you have any insight into this, please mail |
| google-perftools@googlegroups.com. |
| |
| This Windows functionality is also available using MinGW and Msys. |
| In this case, you can use the regular './configure && make' |
| process. 'make install' should also work. The Makefile will limit |
| itself to those libraries and binaries that work on Windows. |
| |
| |
| Basic Installation |
| ================== |
| |
| These are generic installation instructions. |
| |
| The `configure' shell script attempts to guess correct values for |
| various system-dependent variables used during compilation. It uses |
| those values to create a `Makefile' in each directory of the package. |
| It may also create one or more `.h' files containing system-dependent |
| definitions. Finally, it creates a shell script `config.status' that |
| you can run in the future to recreate the current configuration, and a |
| file `config.log' containing compiler output (useful mainly for |
| debugging `configure'). |
| |
| It can also use an optional file (typically called `config.cache' |
| and enabled with `--cache-file=config.cache' or simply `-C') that saves |
| the results of its tests to speed up reconfiguring. (Caching is |
| disabled by default to prevent problems with accidental use of stale |
| cache files.) |
| |
| If you need to do unusual things to compile the package, please try |
| to figure out how `configure' could check whether to do them, and mail |
| diffs or instructions to the address given in the `README' so they can |
| be considered for the next release. If you are using the cache, and at |
| some point `config.cache' contains results you don't want to keep, you |
| may remove or edit it. |
| |
| The file `configure.ac' (or `configure.in') is used to create |
| `configure' by a program called `autoconf'. You only need |
| `configure.ac' if you want to change it or regenerate `configure' using |
| a newer version of `autoconf'. |
| |
| The simplest way to compile this package is: |
| |
| 1. `cd' to the directory containing the package's source code and type |
| `./configure' to configure the package for your system. If you're |
| using `csh' on an old version of System V, you might need to type |
| `sh ./configure' instead to prevent `csh' from trying to execute |
| `configure' itself. |
| |
| Running `configure' takes a while. While running, it prints some |
| messages telling which features it is checking for. |
| |
| 2. Type `make' to compile the package. |
| |
| 3. Optionally, type `make check' to run any self-tests that come with |
| the package. |
| |
| 4. Type `make install' to install the programs and any data files and |
| documentation. |
| |
| 5. You can remove the program binaries and object files from the |
| source code directory by typing `make clean'. To also remove the |
| files that `configure' created (so you can compile the package for |
| a different kind of computer), type `make distclean'. There is |
| also a `make maintainer-clean' target, but that is intended mainly |
| for the package's developers. If you use it, you may have to get |
| all sorts of other programs in order to regenerate files that came |
| with the distribution. |
| |
| Compilers and Options |
| ===================== |
| |
| Some systems require unusual options for compilation or linking that |
| the `configure' script does not know about. Run `./configure --help' |
| for details on some of the pertinent environment variables. |
| |
| You can give `configure' initial values for configuration parameters |
| by setting variables in the command line or in the environment. Here |
| is an example: |
| |
| ./configure CC=c89 CFLAGS=-O2 LIBS=-lposix |
| |
| *Note Defining Variables::, for more details. |
| |
| Compiling For Multiple Architectures |
| ==================================== |
| |
| You can compile the package for more than one kind of computer at the |
| same time, by placing the object files for each architecture in their |
| own directory. To do this, you must use a version of `make' that |
| supports the `VPATH' variable, such as GNU `make'. `cd' to the |
| directory where you want the object files and executables to go and run |
| the `configure' script. `configure' automatically checks for the |
| source code in the directory that `configure' is in and in `..'. |
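| |
| The separate-directory build described above might be scripted roughly |
| as follows. build_in is a hypothetical helper; it assumes the source |
| directory is given as an absolute path and that `make' supports |
| `VPATH'. |
| |
```shell
#!/bin/sh
# Configure and build in a separate object directory.
#   $1: absolute path to the directory that holds the configure script
#   $2: directory that receives object files and executables
build_in() {
    srcdir=$1
    objdir=$2
    if [ ! -x "$srcdir/configure" ]; then
        echo "no configure script in $srcdir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    mkdir -p "$objdir" &&
        ( cd "$objdir" && "$srcdir/configure" && make )
}
```
| |
| Calling it once per architecture, each time with a different object |
| directory, keeps the builds from overwriting one another. |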
| |
| If you have to use a `make' that does not support the `VPATH' |
| variable, you have to compile the package for one architecture at a |
| time in the source code directory. After you have installed the |
| package for one architecture, use `make distclean' before reconfiguring |
| for another architecture. |
| |
| Installation Names |
| ================== |
| |
| By default, `make install' will install the package's files in |
| `/usr/local/bin', `/usr/local/man', etc. You can specify an |
| installation prefix other than `/usr/local' by giving `configure' the |
| option `--prefix=PATH'. |
| |
| You can specify separate installation prefixes for |
| architecture-specific files and architecture-independent files. If you |
| give `configure' the option `--exec-prefix=PATH', the package will use |
| PATH as the prefix for installing programs and libraries. |
| Documentation and other data files will still use the regular prefix. |
| |
| In addition, if you use an unusual directory layout you can give |
| options like `--bindir=PATH' to specify different values for particular |
| kinds of files. Run `configure --help' for a list of the directories |
| you can set and what kinds of files go in them. |
| |
| If the package supports it, you can cause programs to be installed |
| with an extra prefix or suffix on their names by giving `configure' the |
| option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'. |
| |
| Optional Features |
| ================= |
| |
| Some packages pay attention to `--enable-FEATURE' options to |
| `configure', where FEATURE indicates an optional part of the package. |
| They may also pay attention to `--with-PACKAGE' options, where PACKAGE |
| is something like `gnu-as' or `x' (for the X Window System). The |
| `README' should mention any `--enable-' and `--with-' options that the |
| package recognizes. |
| |
| For packages that use the X Window System, `configure' can usually |
| find the X include and library files automatically, but if it doesn't, |
| you can use the `configure' options `--x-includes=DIR' and |
| `--x-libraries=DIR' to specify their locations. |
| |
| Specifying the System Type |
| ========================== |
| |
| There may be some features `configure' cannot figure out |
| automatically, but needs to determine by the type of machine the package |
| will run on. Usually, assuming the package is built to be run on the |
| _same_ architectures, `configure' can figure that out, but if it prints |
| a message saying it cannot guess the machine type, give it the |
| `--build=TYPE' option. TYPE can either be a short name for the system |
| type, such as `sun4', or a canonical name which has the form: |
| |
| CPU-COMPANY-SYSTEM |
| |
| where SYSTEM can have one of these forms: |
| |
| OS KERNEL-OS |
| |
| See the file `config.sub' for the possible values of each field. If |
| `config.sub' isn't included in this package, then this package doesn't |
| need to know the machine type. |
| |
| If you are _building_ compiler tools for cross-compiling, you should |
| use the `--target=TYPE' option to select the type of system they will |
| produce code for. |
| |
| If you want to _use_ a cross compiler, that generates code for a |
| platform different from the build platform, you should specify the |
| "host" platform (i.e., that on which the generated programs will |
| eventually be run) with `--host=TYPE'. |
| |
| Sharing Defaults |
| ================ |
| |
| If you want to set default values for `configure' scripts to share, |
| you can create a site shell script called `config.site' that gives |
| default values for variables like `CC', `cache_file', and `prefix'. |
| `configure' looks for `PREFIX/share/config.site' if it exists, then |
| `PREFIX/etc/config.site' if it exists. Or, you can set the |
| `CONFIG_SITE' environment variable to the location of the site script. |
| A warning: not all `configure' scripts look for a site script. |
| |
| Defining Variables |
| ================== |
| |
| Variables not defined in a site shell script can be set in the |
| environment passed to `configure'. However, some packages may run |
| configure again during the build, and the customized values of these |
| variables may be lost. In order to avoid this problem, you should set |
| them in the `configure' command line, using `VAR=value'. For example: |
| |
| ./configure CC=/usr/local2/bin/gcc |
| |
| will cause the specified gcc to be used as the C compiler (unless it is |
| overridden in the site shell script). |
| |
| `configure' Invocation |
| ====================== |
| |
| `configure' recognizes the following options to control how it |
| operates. |
| |
| `--help' |
| `-h' |
| Print a summary of the options to `configure', and exit. |
| |
| `--version' |
| `-V' |
| Print the version of Autoconf used to generate the `configure' |
| script, and exit. |
| |
| `--cache-file=FILE' |
| Enable the cache: use and save the results of the tests in FILE, |
| traditionally `config.cache'. FILE defaults to `/dev/null' to |
| disable caching. |
| |
| `--config-cache' |
| `-C' |
| Alias for `--cache-file=config.cache'. |
| |
| `--quiet' |
| `--silent' |
| `-q' |
| Do not print messages saying which checks are being made. To |
| suppress all normal output, redirect it to `/dev/null' (any error |
| messages will still be shown). |
| |
| `--srcdir=DIR' |
| Look for the package's source code in directory DIR. Usually |
| `configure' can determine that directory automatically. |
| |
| `configure' also accepts some other, not widely useful, options. Run |
| `configure --help' for more details. |