2009-05-12 Torbjorn Granlund <>
* Version 4.3.1 released.
2009-05-11 Torbjorn Granlund <>
Bump version info.
2009-05-09 Torbjorn Granlund <>
* tests/mpz: Add MPZ_CHECK_FORMAT to many tests.
2009-05-07 Torbjorn Granlund <>
* mpn/x86/pentium4/sse2/mul_basecase.asm: Avoid L(ret), "ret" is
defined in x86-defs.m4.
2009-05-06 Torbjorn Granlund <>
* mpn/x86/p6/aors_n.asm: Use L() for labels.
* mpn/x86/pentium4/sse2/addmul_1.asm: Likewise.
* mpn/x86/pentium4/sse2/mul_1.asm: Likewise.
* mpn/x86/pentium4/sse2/mul_basecase.asm: Likewise.
* mpn/x86/pentium4/sse2/sqr_basecase.asm: Likewise.
* mpn/x86_64/lshift.asm: Likewise.
* mpn/x86_64/rshift.asm: Likewise.
* tests/cxx/ (point_string): Declare as extern "C" to
placate compilers that mangle variable names.
2009-05-04 Torbjorn Granlund <>
* tests/mpz/t-gcd.c: Generate operands that are multiples of each other.
2009-05-01 Torbjorn Granlund <>
* (__GMP_EXTERN_INLINE): Support for more systems.
(gmp_randinit_set): Add missing __GMP_DECLSPEC.
2009-04-28 Torbjorn Granlund <>
* mpn/generic/neg_n.c: New file.
* (gmp_mpn_functions): Add neg_n.
* mpn/asm-defs.m4 (define_mpn): Add neg_n.
* mpn/ (nodist_EXTRA_libmpn_la_SOURCES): Add neg_n.c.
* Handle mpn_neg_n properly.
* mpn/generic/toom_interpolate_7pts.c (divexact_2exp): Nailify.
* mpn/generic/gcdext.c: Change some MPN_NORMALIZE to
* mpn/generic/gcdext_lehmer.c: Likewise.
* mpn/generic/binvert.c: Remove own mpn_neg_n.
* tests/mpz/t-gcd.c: Add some MPZ_CHECK_FORMAT calls.
2009-04-27 Torbjorn Granlund <>
* mpn/ (TARG_DIST): Add minithres.
* mpn/generic/bdiv_dbm1c.c: Handle nails.
2009-04-26 Torbjorn Granlund <>
* config.guess: Recognize more POWER processor types.
2009-04-25 Torbjorn Granlund <>
* mpn/x86/pentium4/sse2/popcount.asm: Work around Apple reloc bug.
* mpn/x86/darwin.m4: Define symbol "DARWIN".
2009-04-19 Torbjorn Granlund <>
* mpn/generic/powm.c (mpn_redc_n): Use ASSERT_ALWAYS, not abort().
* mpn/generic/powm_sec.c: Likewise.
* mpn/powerpc64/aix.m4 (EXTERN_FUNC): New define. Add dummy variants
for other m4 files.
* mpn/powerpc64/mode64/divrem_1.asm: Use EXTERN_FUNC.
* mpn/powerpc64/mode64/divrem_1.asm: Likewise.
2009-04-16 Torbjorn Granlund <>
* mpn/x86_64/x86_64-defs.m4 (JUMPTABSECT): New define.
* mpn/x86_64/darwin.m4: Likewise.
* mpn/x86_64/sqr_basecase.asm: Rework switch code using JUMPTABSECT.
* mpn/x86/x86-defs.m4 (LEA): Get SIZE arguments right.
2009-04-14 Torbjorn Granlund <>
* Version 4.3.0 released.
* scanf/doscan.c (__gmp_doscan): Pad 3-operand scanf call with dummy
* scanf/sscanffuns.c (scan): Disable vsscanf variant for now.
2009-04-13 Torbjorn Granlund <>
* scanf/sscanffuns.c (scan): Rewrite to use stdarg.
* tests/mpz/t-root.c: Rewrite. Add unconditional gcc 4.3.2 tests.
2009-04-09 Torbjorn Granlund <>
* mpn/generic/powm.c: New file.
* mpn/generic/powlo.c: New file.
* mpn/generic/powm_sec.c: New file.
* (gmp_mpn_functions): List new functions.
2009-04-08 Torbjorn Granlund <>
* mpz/urandomm.c: Amend last fix.
2009-04-06 Torbjorn Granlund <>
* Support Sun cc for x86_64.
* mpz/urandomm.c: Handle operand overlap.
2009-03-11 Torbjorn Granlund <>
* (powerpc): Brave removing -Wa,-mppc64, in the hope that
GCC now passes the proper options.
2009-03-09 Torbjorn Granlund <>
* mpn/x86_64/divrem_1.asm: Add a nop to save a cycle in unnormalized
2009-03-05 Torbjorn Granlund <>
* ia64/gmp-mparam.h, arm/gmp-mparam.h, x86/p6/mmx/gmp-mparam.h,
pa32/hppa2_0/gmp-mparam.h, sparc32/v9/gmp-mparam.h: Update.
2009-03-03 Torbjorn Granlund <>
* mpn/ia64/bdiv_dbm1c.asm: Accept/return carry.
2009-03-02 Torbjorn Granlund <>
* (64-bit sparc/solaris): Pass -xO3, not -O3 to solaris
system compiler.
2009-03-01 Torbjorn Granlund <>
* longlong.h (mips, powerpc): Provide assembly-free umul_ppmm for newer
2009-02-04 Torbjorn Granlund <>
* mpn/generic/redc_2.c: Remove code for testing and timing. Update
to current FSF header.
* mpn/generic/redc_1.c: Update to current FSF header.
2009-01-21 Torbjorn Granlund <>
* mpz/powm.c (redc): Remove.
(mpz_powm): Use mpn_redc_1 instead of redc.
* tests/mpz/t-powm.c: Rewrite reference code.
2009-01-18 Torbjorn Granlund <>
* tests/mpz: Increase reps for many tests.
* mpn/generic/rootrem.c (mpn_rootrem_internal): Use MPN_DECR_U instead of
mpn_sub_1 (works around gcc 4.3 bugs and is also faster).
2009-01-16 Torbjorn Granlund <>
* tests/tests.h: Declare refmpn_divrem_2.
2009-01-15 Torbjorn Granlund <>
* mpz/perfpow.c: Add TMP_FREE before every return statement.
* mpn/generic/rootrem.c (mpn_rootrem_internal): Add a missing TMP_FREE.
* (gcc_cflags, gcc_64_cflags): Revert from -O3 to -O2,
the change was accidental and caused too much miscompilation.
2009-01-14 Torbjorn Granlund <>
* tune/tuneup.c (tune_mod_1): Run MOD_1_x_THRESHOLD tests also when
longlong.h specified UDIV_PREINV_ALWAYS.
* mpn/generic/mod_1.c (mpn_mod_1): Properly check for normalization
2009-01-13 Torbjorn Granlund <>
* tune/tuneup.c (tune_mod_1): Tune for MOD_1_1_THRESHOLD,
* mpn/generic/mod_1.c: Rewrite.
* mpn/generic/mod_1_1.c: New file.
* mpn/generic/mod_1_2.c: New file.
* mpn/generic/mod_1_3.c: New file.
* mpn/generic/mod_1_4.c: New file.
* (gmp_mpn_functions): Add mod_1_*.
* mpn/asm-defs.m4 (define_mpn): Add mod_1_*.
* mpn/ (nodist_EXTRA_libmpn_la_SOURCES): Add mod_1_*.c.
* gmp-impl.h: Declare new mpn_mod_1s_* functions and associated
(udiv_rnd_preinv): New macro.
2009-01-12 Torbjorn Granlund <>
* tune/tuneup.c (tune_gcd_dc,tune_gcdext_dc): Lower step_factor to 0.1.
2009-01-08 Torbjorn Granlund <>
* tests/mpz/t-nextprime.c: New test file.
* tests/mpz/ (check_PROGRAMS): Add t-nextprime.
From Niels Möller:
* mpz/nextprime.c: Handle large prime gaps by limiting incr.
2009-01-04 Torbjorn Granlund <>
* mpz/and.c, mpz/ior.c, mpz/xor.c: Re-read only necessary source
pointers after reallocation. Misc cleanup.
* gmp-impl.h (MPN_TOOM44_MAX_N): New define, replaces MPN_TOOM3_MAX_N.
* mpn/x86/fat/diveby3.c: New file.
2008-12-30 Niels Möller <>
* doc/gmp.texi (Greatest Common Divisor Algorithms): Updated
section on GCD algorithms.
2008-12-29 Torbjorn Granlund <>
* doc/gmp.texi (Multiplication Algorithms): Add descriptions of Toom-4
and unbalanced multiplication.
(Radix to Binary): Add warning that text is outdated.
(Contributors): Fix typos.
* mpn/generic/toom*.c: Use coherent MAYBE_ macros for trimming
unreachable recursive functions.
* gmp-impl.h: Update toom itch functions.
* mpn/x86_64/sqr_basecase.asm: Slightly increase stack allocation, to
placate tuneup.
2008-12-28 Torbjorn Granlund <>
* mpn/x86_64/pentium4/aors_n.asm: Tune prologue code.
* mpn/x86_64/pentium4/aorslsh1_n.asm: New file.
* mpn/x86_64/darwin.m4: Define symbol "DARWIN".
* mpn/x86_64/invert_limb.asm: Work around darwin quirks.
* mpn/x86_64/sqr_basecase.asm: Further optimize, support Darwin.
* mpn/x86_64/invert_limb.asm: New file.
2008-12-27 Torbjorn Granlund <>
* mpn/x86_64/core2/aorslsh1_n.asm: New file.
2008-12-26 Torbjorn Granlund <>
* mpz/perfpow.c: Handle negative arguments properly.
* tests/mpz/t-perfpow.c: New file.
* tests/mpz/ (check_PROGRAMS): Add t-perfpow.
2008-12-23 Torbjorn Granlund <>
* tests/mpz/t-mul.c (dump_abort): Improve error message.
* gcd.c gcd_subdiv_step.c gcdext.c gcdext_subdiv_step.c:
Remove private mpn_zero_p.
* tune/tuneup.c (tune_mul): Tune for MUL_TOOM44_THRESHOLD.
(tune_sqr): Tune for SQR_TOOM4_THRESHOLD.
* tune/ (TUNE_MPN_SRCS_BASIC): Add toom44_mul.c and
* (gmp_mpn_functions): Toom function updates.
* Rename mpn/mul_toomMN.c to mpn/toomMN_mul.c. Function names changed
* mpn/toomMN_mul.c: Add scratch parameter. Do recursive multiplies
properly. Misc tuning. Remove CHECK and TIMING code.
* mpn/toom2_sqr.c, mpn/toom3_sqr.c, mpn/toom4_sqr.c: New files.
* gmp-impl.h (mpn_toomMN_mul_itch): Several new functions.
(mpn_zero_p): New functions.
Add various TOOM4/TOOM44 related parameters.
Update mpn_toomMN_mul prototypes.
* mpn/generic/mul_n.c (mpn_mul_n): Call mpn_toom44_mul. Use TMP_BALLOC
instead of malloc.
(mpn_sqr_n): Analogous changes.
* mpn/generic/mul.c: Update unbalanced toom code to pass scratch space.
2008-12-21 Torbjorn Granlund <>
* mpz/nextprime.c: Add TMP_SDECL/MARK/FREE.
2008-12-20 Torbjorn Granlund <>
* mpn/generic/sqrtrem.c (mpn_sqrtrem1): Rewrite, improve interface.
(invsqrttab): New table, remove table approx_tab.
(mpn_sqrtrem2): Optimize, update mpn_sqrtrem1 call.
(mpn_sqrtrem): Update mpn_sqrtrem1 call.
2008-12-18 Torbjorn Granlund <>
* mpz/nextprime.c: Run 10 mpz_millerrabin tests (was 5).
Give credit to authors.
* mpn/x86_64/redc_1.asm: Align stack as mandated by ABI.
* mpn/x86_64/divrem_2.asm: Add some comments.
* mpn/x86_64/darwin.m4: New file.
* Use x86_64/darwin.m4.
2008-12-15 Torbjorn Granlund <>
* doc/projects.html: Remove GCD and division projects, update text on
* doc/tasks.html: Add a caution that the file is somewhat
2008-12-14 Torbjorn Granlund <>
* mpn/alpha/ev6/aorsmul_1.asm: New file (same code for mpn_addmul_1,
much improved for mpn_submul_1).
* mpn/alpha/ev6/addmul_1: File removed.
* mpn/alpha/ev6/submul_1: File removed.
2008-12-09 Torbjorn Granlund <>
From David Harvey:
* mpn/x86_64/mul_basecase.asm: Further tweaks for code size and speed.
* mpn/powerpc64/mode64/divrem_1.asm: Rewrite.
* mpn/powerpc64/mode64/mul_basecase.asm: New file.
2008-12-08 Torbjorn Granlund <>
* mpn/powerpc64/mode64/gmp-mparam.h: New file.
* gmp-impl.h: Additional cleanups.
(mpn_set_str_compute_powtab): New prototype.
(mpn_powm, mpn_powlo): New prototypes.
* mpz/pow_ui.c: Handle some small exponents locally.
2008-12-07 Torbjorn Granlund <>
* mpn/generic/set_str.c: Remove prototypes (they are in gmp-impl.h).
* tune/set_strs.c, tune/set_strb.c: Make prototypes effective by moving
the #define mpn_set_str* before including gmp-impl.h.
* All files: Change _PROTO => __GMP_PROTO.
* tune/speed.c (routine): Remove non-working choice mpn_set_str_subquad.
* tune/common.c (speed_mpn_dc_set_str): Remove, it is broken.
* mpn/generic/toom_interpolate_7pts.c (divexact_2exp): Make this static,
and inline it.
* gmp-impl.h: Major cleanup.
(Remove formal parameter names. Use __GMP_PROTO consistently. Move
__GMP_PROTO and __MPN use to adjacent lines for declared function.
Fix typos. Remove code inside #if 0.)
* (gmp_mpn_functions): Add mul_toom33. Reformat.
2008-12-05 Torbjorn Granlund <>
* mpn/generic/redc_1.c: New file.
* mpn/generic/redc_2.c: New file.
* (gmp_mpn_functions): List redc_1 and redc_2.
(HAVE_NATIVE): Likewise.
* tune/common.c (speed_mpn_redc_1): Renamed from speed_redc.
* tune/speed.c (routine): Remove "redc", and "mpn_redc_1".
* tune/speed.h (SPEED_ROUTINE_REDC_1): Renamed from SPEED_ROUTINE_REDC.
Updated call.
* tune/tuneup.c (tune_powm): Update redc call.
2008-12-04 Torbjorn Granlund <>
* mpn/x86_64/sqr_basecase.asm: Inline a combined diagonal product code
and addlsh1 loop. Misc cleanup.
2008-12-02 Torbjorn Granlund <>
* mpn/x86_64/sqr_basecase.asm: New file.
2008-11-30 Torbjorn Granlund <>
* mpn/generic/sqr_basecase.c: Fix typo in mpn_addmul_2s variant.
2008-11-28 Torbjorn Granlund <>
* mpn/x86_64/redc_1.asm: Rewrite.
2008-11-27 Torbjorn Granlund <>
* tests/refmpn.c (refmpn_redc_1): New function.
2008-11-25 Torbjorn Granlund <>
* mpn/x86/k7/aorsmul_1.asm: Actually handle mpn_submul_1.
2008-11-23 Torbjorn Granlund <>
* mpn/x86_64/divrem_1.asm: Rewrite.
* alpha/divrem_2.asm: New file.
* powerpc32/divrem_2.asm: New file.
* powerpc64/mode64/divrem_2.asm: New file.
* x86/divrem_2.asm: New file.
* x86_64/divrem_2.asm: New file.
* tests/refmpn.c (refmpn_divrem_2): New function.
2008-11-22 Torbjorn Granlund <>
* mpn/x86/k7/mul_1.asm: Rewrite for smaller size and better speed.
* mpn/x86/k7/aorsmul_1.asm: Likewise.
* acinclude.m4 (GMP_VERSION): Include last component even when zero.
2008-11-21 Torbjorn Granlund <>
* mpn/x86_64/README: Rewrite.
* tests/devel/try.c (malloc_region, mprotect_maybe): Add casts for
printf type correctness.
Bump version info.
2008-11-20 Torbjorn Granlund <>
* gmp-impl.h: Rename modlimb_invert to binvert_limb.
* tune/speed.h: Likewise.
* tune/modlinv.c: Likewise.
* tune/common.c: Likewise.
* tests/t-modlinv.c: Likewise.
* tests/t-constants.c: Likewise.
* mpn/sparc64/mode1o.c: Likewise.
* mpn/alpha/dive_1.c: Likewise.
* mpn/sparc64/dive_1.c: Likewise.
* mpn/generic/mode1o.c: Likewise.
* mpn/generic/dive_1.c: Likewise.
* mpn/generic/bdivmod.c: Likewise.
* mpn/alpha/mode1o.asm: Likewise.
* mpn/asm-defs.m4: Likewise.
* mpn/ia64/mode1o.asm: Likewise.
* mpn/powerpc32/README: Likewise.
* mpn/powerpc32/mode1o.asm: Likewise.
* mpn/powerpc64/mode64/dive_1.asm: Likewise.
* mpn/powerpc64/mode64/mode1o.asm: Likewise.
* mpn/x86/dive_1.asm: Likewise.
* mpn/x86/k6/mmx/dive_1.asm: Likewise.
* mpn/x86/k6/mode1o.asm: Likewise.
* mpn/x86/k7/dive_1.asm: Likewise.
* mpn/x86/k7/mode1o.asm: Likewise.
* mpn/x86/p6/dive_1.asm: Likewise.
* mpn/x86/p6/mode1o.asm: Likewise.
* mpn/x86/pentium/dive_1.asm: Likewise.
* mpn/x86/pentium/mode1o.asm: Likewise.
* mpn/x86/pentium4/sse2/dive_1.asm: Likewise.
* mpn/x86/pentium4/sse2/mode1o.asm: Likewise.
* mpn/x86_64/dive_1.asm: Likewise.
* mpn/x86_64/mode1o.asm: Likewise.
* mpn/x86_64/aors_n.asm: Replace with slightly faster, more alignment
neutral loop.
2008-11-18 Torbjorn Granlund <>
* Remove gcd_finda related declarations.
* gmp-impl.h (mpn_gcd_finda): Remove declaration.
* mpn/ (nodist_EXTRA_libmpn_la_SOURCES): Remove gcd_finda.
* mpn/asm-defs.m4: Remove define_mpn(gcd_finda).
* mpn/x86/k6/gcd_finda.asm: Remove file.
* tests/devel/try.c (param_init): Remove mpn_gcd_finda.
(choice_array): Remove mpn_gcd_finda.
* tests/mpn/t-instrument.c (check): Remove testing of mpn_gcd_finda.
* tests/refmpn.c (refmpn_gcd_finda): Remove.
* tests/tests.h (refmpn_gcd_finda): Remove declaration.
* tune/common.c (speed_mpn_gcd_finda): Remove.
* tune/gcd_finda_gen.c: Remove file.
* tune/speed.h (speed_mpn_gcd_finda): Remove declaration.
* tune/speed.c (routine): Remove mpn_gcd_finda entry.
* tests/mpz/t-powm.c: Print test number when failing a test.
* mpn/x86_64/redc_1.asm (CALL): Move from here...
* mpn/x86_64/x86_64-defs.m4: here.
* gmp-impl.h (mpn_jacobi_base): Remove parameter names.
2008-11-11 Torbjorn Granlund <>
* tests/mpf/t-conv.c: Add some specific tests, supplementing the random
2008-11-09 Torbjorn Granlund <>
* mpf/set_str.c: Default 'base' before letting exp_base inherit it.
* tests/cxx/ Use the right precision for all float constants.
2008-11-08 Torbjorn Granlund <>
* doc/gmp.texi (Float Comparison): Update mpf_eq documentation.
* mpf/eq.c: Compare the right number of bits.
2008-11-02 Torbjorn Granlund <>
Undo, it made testing too slow:
* tests/mpz/t-mul.c: Use slower geometric progression for operand
* mpn/x86/k7/mod_34lsub1.asm: Use movzb for masking low 8 bits.
2008-10-31 Niels Möller <>
* mpn/generic/hgcd2.c (div1): New function (taken from old gcdext
(mpn_hgcd2): Use single precision for the second half of the work.
2008-10-30 Torbjorn Granlund <>
* mpn/x86/p6/sse2/gmp-mparam.h: New file.
2008-10-29 Torbjorn Granlund <>
* (x86 fat_path): Add "x86/p6/sse2".
* mpn/x86/fat/fat.c (__gmpn_cpuvec_init): Recognize sse2 capable p6
(pentiumm, core2).
* mpn/x86/p6/sse2/mul_1.asm: New file.
* mpn/x86/p6/sse2/addmul_1.asm: New file.
* mpn/x86/p6/sse2/submul_1.asm: New file.
* mpn/x86/p6/sse2/mul_basecase.asm: New file.
* mpn/x86/p6/sse2/sqr_basecase.asm: New file.
* mpn/x86/p6/sse2/popcount.asm: New file.
* mpn/x86/fat/fat.c (__gmpn_cpuvec_init): Handle "extended" fields for
model and family.
2008-10-28 Torbjorn Granlund <>
From Mickael Gastineau:
* (gmp_urandomm_ui, gmp_urandomb_ui): Add __GMP_DECLSPEC.
2008-10-27 Torbjorn Granlund <>
* (mpn_gcdext_1): Remove bogus __GMP_ATTRIBUTE_PURE.
2008-10-27 Niels Möller <>
* tune/common.c (speed_mpn_hgcd): Call mpn_hgcd_matrix_init once
for each call to mpn_hgcd.
(speed_mpn_hgcd_lehmer): Likewise.
2008-10-26 Torbjorn Granlund <>
* Point to p6/sse2 for pentiumm and core2.
* gmp-impl.h (mpn_add_nc, mpn_sub_nc): Move these macros to after fat
* tune/common.c, tune/speed.c, tune/speed.h:
Add speed measurement of mpn_bdiv_dbm1c.
2008-10-24 Torbjorn Granlund <>
* mpn/x86_64/gmp-mparam.h (MUL_FFT_TABLE2, SQR_FFT_TABLE2): Extend.
* mpz/nextprime.c: Move declarations to function beginning.
2008-10-23 Niels Möller <>
* gmp-impl.h (DECL_gcdext_1): Deleted.
2008-10-22 Torbjorn Granlund <>
* mpn/x86_64/atom/aors_n.asm: New file.
* mpn/x86_64/atom/gmp-mparam.h: New file.
2008-10-21 Torbjorn Granlund <>
With Niels Möller:
* mpz/nextprime.c: Rewrite.
* tests/devel/try.c (main): Use strtol for 's' and 'S' optargs.
* mpn/x86_64/pentium4/rshift.asm: Misc cleanups.
* mpn/x86_64/pentium4/lshift.asm: Likewise.
* mpn/x86_64/pentium4/aors_n.asm: Use fewer registers.
* Set up specific path for x86_64/atom.
2008-10-21 Niels Möller <>
* mpn/ (nodist_EXTRA_libmpn_la_SOURCES): Removed
* mpn/generic/qstack.c: Deleted obsolete file.
2008-10-20 Torbjorn Granlund <>
* mpn/x86_64/core2/aorsmul_1.asm: New file.
2008-10-19 Torbjorn Granlund <>
* mpn/x86_64/aors_n.asm: Remove redundant MULFUNC_PROLOGUE.
* gmp-impl.h (popc_limb): Remove redundant checks of GMP_LIMB_BITS
inside several of these macros.
2008-10-17 Torbjorn Granlund <>
* tests/mpz/t-mul.c: Use slower geometric progression for operand
sizes. Do every other test for same size operands.
2008-10-15 Torbjorn Granlund <>
* mpn/x86_64/mul_basecase.asm: Simplify addressing in epilogue.
* mpn/mips64/divrem_1.asm: Remove file, it is n32-only, and uses an old
* config.guess, config.sub, Support Intel Atom processor.
2008-10-10 Torbjorn Granlund <>
* mpq/mul.c: Fix typo in last change.
2008-10-09 Torbjorn Granlund <>
* tests/refmpn.c (refmpn_sb_divrem_mn): Work around a gcc bug.
2008-10-08 Torbjorn Granlund <>
* mpq/mul.c: Use TMP_ALLOC. Cleanup.
* mpq/div.c: Likewise.
* mpn/x86_64/mul_basecase.asm: Use lea directly for loading entry point
2008-10-09 Niels Möller <>
* mpn/x86/k7/gmp-mparam.h: Updated GCD-related values.
2008-10-05 Torbjorn Granlund <>
* mpn/generic/mul_fft.c (mpn_mul_fft_internal): Do store
mpn_fft_norm_modF return value, if (rec).
2008-10-04 Torbjorn Granlund <>
* mpn/x86_64/aorsmul_1.asm: Replace with faster code.
* mpn/x86_64/mul_1.asm: Likewise.
* mpn/x86_64/addmul_2.asm: Likewise.
* mpn/x86_64/mul_2.asm: Likewise.
* mpn/x86_64/mul_basecase.asm: Likewise.
2008-10-02 Torbjorn Granlund <>
* mpn/minithres/gmp-mparam.h: Update FFT values.
2008-10-02 Niels Möller <>
* hgcd.c (mpn_hgcd_matrix_mul): Fixed normalization bug.
2008-09-24 Torbjorn Granlund <>
* Handle --enable-minithres.
* mpn/minithres/gmp-mparam.h: Update all values.
2008-09-22 Torbjorn Granlund <>
* tune/speed.c (routine): New entry for mpn_mul.
* tune/speed.h (SPEED_ROUTINE_MPN_MUL): Renamed from
(speed_mpn_mul): Renamed from speed_mpn_mul_basecase.
(SPEED_ROUTINE_MPN_MUL): Allocate our own memory of xp operand.
* tune/common.c: Corresponding changes.
2008-09-22 Niels Möller <>
* mpn/generic/gcdext.c (hgcd_mul_matrix_vector): New function,
replaces addmul2_n. Needs less copying.
(mpn_gcdext): Use hgcd_mul_matrix_vector. Updated for interface
change in mpn_gcdext_subdiv_step
* mpn/generic/hgcd.c (hgcd_matrix_mul_1): Rewritten to use
(hgcd_step): Updated for interface change in
* mpn/generic/gcdext_lehmer.c (mpn_gcdext_lehmer_n): Updated for
interface changes in mpn_hgcd_mul_matrix1_vector,
mpn_hgcd_mul_matrix1_inverse_vector and mpn_gcdext_subdiv_step.
* mpn/generic/gcd_lehmer.c (mpn_gcd_lehmer_n): Updated for
interface change in mpn_hgcd_mul_matrix1_inverse_vector.
* mpn/generic/gcdext_subdiv_step.c (mpn_gcdext_subdiv_step): Use
separate scratch arguments for the quotient and for the cofactor
* mpn/generic/hgcd2.c (mpn_hgcd_mul_matrix1_vector): Interface
change. Store first element in rp and leave ap unmodified. No
additional scratch space or copying needed. Callers that require
modification in place still need to copy one of the inputs.
(mpn_hgcd_mul_matrix1_inverse_vector): Likewise.
2008-09-22 Niels Möller <>
* mpn/generic/hgcd.c (hgcd_matrix_mul_1): Use mpn_addaddmul_1msb0.
* mpn/generic/hgcd2.c (mpn_hgcd_mul_matrix1_vector): Likewise.
* mpn/generic/gcd.c: Use libspeed for timing measurements.
* gmp-impl.h: Declare mpn_addaddmul_1msb0.
* mpn/asm-defs.m4: Added addaddmul_1msb0.
* mpn/x86_64/addaddmul_1msb0.asm: New file.
* (gmp_mpn_functions_optional): Added
(HAVE_NATIVE): List addaddmul_1msb0.
2008-09-21 Torbjorn Granlund <>
* mpn/generic/get_str.c (GET_STR_DC_THRESHOLD): Remove default.
Misc code cleanups.
* gmp-impl.h (mpn_dc_set_str_itch): Allocate GMP_LIMB_BITS more limbs.
* mpn/generic/set_str.c:
(mpn_dc_set_str): Remove impossible case, replace by an ASSERT.
2008-09-18 Torbjorn Granlund <>
* mpn/alpha/ev6/gmp-mparam.h (DIVEXACT_BY3_METHOD): Define.
* mpn/ia64/diveby3.asm: Remove.
* mpn/x86/diveby3.asm: Remove.
* mpn/x86/k6/diveby3.asm: Remove.
* mpn/x86/k7/diveby3.asm: Remove.
* mpn/x86/p6/diveby3.asm: Remove.
* mpn/x86/pentium/diveby3.asm: Remove.
* mpn/x86_64/diveby3.asm: Remove.
* mpn/x86/pentium4/sse2/diveby3.asm: Remove.
* (HAVE_NATIVE): List divexact_by3c.
* gmp-impl.h (mpn_divexact_by3c): Override's definition.
(DIVEXACT_BY3_METHOD): Don't default to 0 if
2008-09-18 Niels Möller <>
* mpn/generic/gcd.c (main): Added code for tuning of CHOOSE_P.
* mpn/generic/hgcd.c (mpn_hgcd_matrix_mul): Assert that inputs are
2008-09-17 Niels Möller <>
* mpn/generic/gcdext.c (mpn_gcdext): p = n/5 caused a
slowdown for large inputs. As a compromise, use p = n/2 for the
first iteration, and p = n/3 for the rest. Handle the first
iteration specially, since the initial u0 and u1 are trivial.
* mpn/x86_64/gmp-mparam.h (GCDEXT_DC_THRESHOLD): Reduced threshold
from 409 to 390.
* mpn/generic/gcdext.c (CHOOSE_P): New macro. Use p = n/5.
(mpn_gcdext): Use CHOOSE_P, and generalized the calculation of
scratch space.
* tune/tuneup.c (tune_hgcd): Use default step factor.
* mpn/x86_64/gmp-mparam.h (GCD_DC_THRESHOLD): Reduced from 493 to
* mpn/generic/gcd.c (CHOOSE_P): New macro, to determine the
split when calling hgcd. Use p = 2n/3, as that seems better than
the more obvious split p = n/2.
(mpn_gcd): Use CHOOSE_P, and generalized the calculation of
scratch space.
2008-09-16 Torbjorn Granlund <>
* mpn/generic/toom_interpolate_7pts.c: Use new mpn_divexact_byN
* gmp-impl.h (mpn_divexact_by3, mpn_divexact_by5, mpn_divexact_by7,
mpn_divexact_by9, mpn_divexact_by11, mpn_divexact_by13,
mpn_divexact_by15): New macros, defined in terms of mpn_bdiv_dbm1.
* (gmp_mpn_functions): List bdiv_dbm1c.
(HAVE_NATIVE): Likewise.
* mpn/asm-defs.m4: Define bdiv_dbm1c.
* gmp-impl.h (mpn_bdiv_dbm1c): Declare.
(mpn_bdiv_dbm1): New macro.
* mpn/generic/bdiv_dbm1c.c: New file.
* mpn/alpha/bdiv_dbm1c.asm: New file.
* mpn/ia64/bdiv_dbm1c.asm: New file.
* mpn/powerpc32/bdiv_dbm1c.asm: New file.
* mpn/powerpc64/mode64/bdiv_dbm1c.asm: New file.
* mpn/x86/bdiv_dbm1c.asm: New file.
* mpn/x86_64/bdiv_dbm1c.asm: New file.
* mpn/generic/diveby3.c: Add mpn_bdiv_dbm1c based function.
Choose function depending on DIVEXACT_BY3_METHOD.
* gmp-impl.h (DIVEXACT_BY3_METHOD): Provide default.
2008-09-16 Niels Möller <>
* mpn/generic/hgcd.c (mpn_hgcd_addmul2_n): Moved function to
gcdext.c, where it is used.
* mpn/generic/gcdext.c (addmul2_n): Moved and renamed, was
mpn_hgcd_addmul2_n. Made static. Deleted input normalization.
Deleted rn argument.
(mpn_gcdext): Updated calls to addmul2_n, and added assertions.
* gmp-impl.h (MPN_HGCD_MATRIX_INIT_ITCH): Increased storage by four limbs.
(MPN_HGCD_LEHMER_ITCH): Reduced storage by one limb.
* mpn/generic/hgcd.c (mpn_hgcd_matrix_init): Use two extra limbs.
(hgcd_step): Use overlapping arguments to mpn_tdiv_qr.
(mpn_hgcd_matrix_mul): Deleted normalization code. Tighter bounds
for the element size of the product. Needs two extra limbs of
storage for the elements.
(mpn_hgcd_itch): Updated storage calculation.
* mpn/generic/gcd_subdiv_step.c (mpn_gcd_subdiv_step): Use
overlapping arguments to mpn_tdiv_qr. Use mpn_zero_p.
* mpn/generic/gcd.c (mpn_gcd): Use mpn_zero_p.
2008-09-15 Niels Möller <>
* mpn/generic/hgcd.c (mpn_hgcd_matrix_init): Updated for deleted
tp pointer.
(hgcd_matrix_update_q): Likewise.
(mpn_hgcd_matrix_mul): Likewise.
(mpn_hgcd_itch): Updated calculation of scratch space.
* gmp-impl.h (struct hgcd_matrix): Deleted tp pointer.
(MPN_HGCD_MATRIX_INIT_ITCH): Reduced storage.
(mpn_hgcd_step, MPN_HGCD_STEP_ITCH): Deleted declarations.
2008-09-15 Niels Möller <>
* mpn/x86_64/gmp-mparam.h (MATRIX22_STRASSEN_THRESHOLD): New
* mpn/generic/hgcd.c (mpn_hgcd_matrix_mul): Use mpn_matrix22_mul.
(mpn_hgcd_itch): Updated calculation of scratch space. Use
count_leading_zeros to get the recursion depth.
* mpn/generic/gcd.c (mpn_gcd): Fixed calculation of scratch space,
and use mpn_hgcd_itch.
2008-09-15 Niels Möller <>
* tune/tuneup.c (tune_matrix22_mul): New function.
(all): Use it.
* tune/common.c (speed_mpn_matrix22_mul): New function.
* tune/ (TUNE_MPN_SRCS_BASIC): Added matrix22_mul.c.
* tests/mpn/t-matrix22.c: Use MATRIX22_STRASSEN_THRESHOLD to
select sizes for tests.
* gmp-impl.h (MATRIX22_STRASSEN_THRESHOLD): New threshold
* (gmp_mpn_functions): Added matrix22_mul.
* gmp-impl.h: Added declarations for mpn_matrix22_mul and related
* mpn/ (nodist_EXTRA_libmpn_la_SOURCES): Added
* tests/mpn/ (check_PROGRAMS): Added t-matrix22.
* tests/mpn/t-matrix22.c: New file.
* mpn/generic/matrix22_mul.c: New file.
2008-09-11 Niels Möller <>
* tune/tuneup.c: Updated tuning of gcdext.
* mpn/x86_64/gmp-mparam.h (GCDEXT_DC_THRESHOLD): Reduced threshold
from 713 to 409.
2008-09-11 Niels Möller <>
* gmp-impl.h: Updated for gcdext changes.
(GCDEXT_DC_THRESHOLD): New constant, renamed from
* mpn/generic/gcdext.c (compute_v): Accept non-normalized a and b
as inputs.
(mpn_gcdext): Rewrote and simplified. Now uses the new mpn_hgcd
* mpn/generic/hgcd.c (mpn_hgcd_addmul2_n): Renamed from addmul2_n
and made non-static. Changed interface to take non-normalized
inputs, and only two size arguments.
(mpn_hgcd_matrix_mul): Simplified using new mpn_hgcd_addmul2_n.
* mpn/generic/gcdext_lehmer.c (mpn_gcdext_lehmer_itch): Deleted
(mpn_gcdext_lehmer_n): Renamed from mpn_gcd_lehmer. Now takes
inputs of equal size. Moved the code for the division step to a
separate function...
* mpn/generic/gcdext_subdiv_step.c (mpn_gcdext_subdiv_step): New
file, new function.
* (gmp_mpn_functions): Added gcdext_subdiv_step.
2008-09-10 Torbjorn Granlund <>
* tests/devel/anymul_1.c: Include <string.h>.
* Unconditionally include <cstdio>.
2008-09-10 Niels Möller <>
* tune/common.c: #if:ed out speed_mpn_gcd_binary and
* tune/speed.c (routine): #if:ed out mpn_gcd_binary, mpn_gcd_accel
and find_a.
* tune/ (libspeed_la_SOURCES): Removed gcd_bin.c
gcd_accel.c gcd_finda_gen.c.
* tune/tuneup.c: Enable tuning of GCD_DC_THRESHOLD.
* mpn/generic/gcd.c (mpn_gcd): Rewrote and simplified. Now uses
the new mpn_hgcd interface.
* */gmp-mparam.h: Renamed GCD_SCHOENHAGE_THRESHOLD to
* mpn/generic/gcd_lehmer.c (mpn_gcd_lehmer_n): Renamed (was
mpn_gcd_lehmer). Now takes inputs of equal size.
* mpn/generic/gcd_lehmer.c (mpn_gcd_lehmer): Reintroduced gcd_2,
to get better performance for small inputs.
* mpn/generic/hgcd.c: Don't hardcode small HGCD_THRESHOLD.
* mpn/x86_64/gmp-mparam.h (HGCD_THRESHOLD): Reduced from 145 to
* */gmp-mparam.h: Renamed HGCD_SCHOENHAGE_THRESHOLD to
2008-09-09 Torbjorn Granlund <>
* doc/gmp.texi: Fix a typo and clarify mpn_gcdext docs.
2008-09-09 Niels Möller <>
* tune/common.c (speed_mpn_hgcd, speed_mpn_hgcd_lehmer): Adapted
to new hgcd interface.
* gmp-impl.h (MPN_HGCD_LEHMER_ITCH): New macro.
* hgcd.c (mpn_hgcd_lehmer): Renamed function, from hgcd_base. Made
* gcd_lehmer.c (mpn_gcd_lehmer): Use hgcd2 also for n == 2.
* gcdext_lehmer.c (mpn_gcdext_lehmer): Simplified code for
division step. Added proper book-keeping of swaps, which affect
the sign of the returned cofactor.
* tests/mpz/t-gcd.c (one_test): Display co-factor when mpn_gcdext
* gcd_lehmer.c (mpn_gcd_lehmer): At end of loop, need to handle
the special case n == 1 correctly.
* gcd_subdiv_step.c (mpn_gcd_subdiv_step): Simplified function.
The special cancellation logic is not needed here.
2008-09-08 Torbjorn Granlund <>
* mpn/generic/invert.c: Add working but slow code.
* mpn/x86_64/x86_64-defs.m4 (R32, R8): New macros.
* mpn/ia64/submul_1.asm: Move some labels for broader assembler
* gmp-impl.h (mpn_mul_3, mpn_mul_4): Declare.
* tests/tests.h (refmpn_mul_3, refmpn_mul_4): Declare.
* tests/try.c (param_init): Set things up for mpn_mul_3 and mpn_mul_4.
(choice_array): Likewise.
(call): Likewise.
* mpn/ (nodist_EXTRA_libmpn_la_SOURCES):
Add mul_3.c and mul_4.c.
* mpn/asm-defs.m4: Define mul_3 and mul_4.
* tests/refmpn.c (refmpn_mul_N): New function.
(refmpn_mul_2): Remove old definition, call refmpn_mul_N.
(refmpn_mul_3, refmpn_mul_4): New functions.
* tune/common.c (speed_mpn_mul_3, speed_mpn_mul_4): New functions.
* tune/speed.h (speed_mpn_mul_3, speed_mpn_mul_4): Declare.
* tune/speed.c (routine): New entries for mpn_mul_2 and mpn_mul_3.
* Update to libtool 1.5.24.
* mpn/generic/mul_toom22.c: Compute s and t more cleverly.
2008-09-08 Niels Möller <>
* tests/mpn/t-hgcd.c: Updated tests. Rewrite of hgcd_ref.
* mpn/generic/gcdext_lehmer.c (mpn_gcdext_lehmer_itch): New function.
(mpn_gcdext_lehmer): Various bugfixes.
* gcdext.c (mpn_gcdext): Allocate scratch space for gcdext_lehmer.
* mpn/generic/gcd_lehmer.c (gcd_2): ASSERT that inputs are odd.
(mpn_gcd_lehmer): Added tp argument, for scratch space. Make both
arguments odd before calling gcd_2.
* mpn/generic/hgcd.c (mpn_hgcd): Allow the trivial case n <= 2,
and return 0 immediately.
* gmp-impl.h (MPN_EXTRACT_NUMB): New macro.
* (gmp_mpn_functions): Added gcdext_lehmer.
2008-09-05 Torbjorn Granlund <>
* mpn/generic/toom_interpolate_7pts.c: Use mpn_divexact_by3c instead of
* doc/texinfo.tex: Update to 2007-06-29.13.
* doc/gmp.texi: Update GMP site URL. Fix some typos.
* demos/pexpr.c (main): Allow bases up to 62.
* gmp-impl.h: Remove formal parameter names from function prototypes.
* config.guess: Recognize recent AMD and Itanium CPUs.
Default X86 CPU recognition to configfsf.guess' value.
* Handle core2 separately from athlon64.
2008-09-05 Niels Möller <>
* */, configure, aclocal.m4, Removed files
from repository. They're instead generated by automake and
autoconf before distribution.
2008-08-25 Torbjorn Granlund <>
* mpf/set_str.c: Allocate mantissa space based on mantissa size,
not on destination variable space.
* mpf/set_str.c: Accept unary plus before exponent.
2008-08-06 Torbjorn Granlund <>
* mpn/generic/mul_toom22.c: Add statistics gathering functionality,
triggered by cpp predef STAT.
From David Harvey:
* mpn/generic/mul_toom22.c: Decrease scratch space usage.
2008-08-02 Torbjorn Granlund <>
* tests/misc/t-scanf.c: Avoid negative arguments to _ui functions.
* tests/misc/t-printf.c: Likewise.
* acinclude.m4 (X86_PATTERN): Add geode.
* acinclude.m4 (CL_AS_NOEXECSTACK): Avoid -q flag to grep.
2008-08-01 Torbjorn Granlund <>
* acinclude.m4 (CL_AS_NOEXECSTACK): New.
* mpn/ Use ASM_FLAGS (defined by CL_AS_NOEXECSTACK).
* gmpxx.h (__GMP_DBL_LIMBS): Use DBL_MAX_EXP instead of
std::numeric_limits<double>::max_exponent for better portability.
2008-07-29 Torbjorn Granlund <>
* gmpxx.h (__GMP_DBL_LIMBS): New #define.
(__GMP_ULI_LIMBS): New #define.
(__GMPXX_TMP_UI): New macro.
(__GMPXX_TMP_SI): New macro.
(__GMPXX_TMP_D): New macro.
(struct __gmp_binary_and): Rewrite, using the new macros.
(struct __gmp_binary_ior): Likewise.
(struct __gmp_binary_xor): Likewise.
2008-07-28 Torbjorn Granlund <>
* tests/cxx/ Add some tests for logical operations.
2008-07-24 Torbjorn Granlund <>
* gmpxx.h: Use __GMPZ_* instead of __GMPZZ_* for bitwise ops, remove
Remove repeated #undefs.
(__gmp_alloc_cstring): Declare freefunc as extern "C".
2008-07-23 Torbjorn Granlund <>
* (__GMP_CC): New define, undocumented for now.
(__GMP_CFLAGS): Likewise.
2008-07-21 Torbjorn Granlund <>
* tests/amd64check.c: Fix a printf type clash.
* mpz/realloc.c: Amend last fix.
* Include <cstdlib> for C++.
* Handle new gcc 4.3 inline semantics defaults.
* configfsf.guess: Update to version of 2008-04-14.
* configfsf.sub: Update to version of 2008-06-16.
* Separate core2 and athlon64 flags handling.
2008-06-19 Torbjorn Granlund <>
* config.guess: Recognize pentiumm and AMD geode.
* config.sub: Likewise.
* Likewise.
2008-06-02 Torbjorn Granlund <>
* Disallow odd nails sizes.
* Inherit default gcc_cflags/gcc_64_cflags everywhere.
2008-05-23 Torbjorn Granlund <>
* mpz/init2.c: Rewrite to avoid internal overflow and to detect mpz_t
* mpz/realloc2.c: Likewise.
* mpz/realloc.c: Detect mpz_t overflow.
2008-05-22 Torbjorn Granlund <>
* (sparc): Remove -fast, it causes documented
* config.guess: Properly handle the "extended" variants of x86 cpuid.
2008-05-09 Torbjorn Granlund <>
* gmp-impl.h (mpn_mul_fft): Now void.
(udiv_qrnnd_preinv3): Special case for constant (nl).
2008-05-08 Torbjorn Granlund <>
* mpn/generic/mul_fft.c: Clean up types in TRACE (printf (...)).
(TRACE): Redefine to allow command line control.
(mpn_mul_fft_internal): Now void, remove return value.
(mpn_mul_fft): Likewise.
(MPN_FFT_TABLE2_SIZE): Up size from 256 to 512.
(mpn_fft_fft): Call mpn_fft_mul_2exp_modF just once instead of twice,
then add/subtract result. Get rid of temp allocation as a result.
Remove some redundant CNST_LIMB.
(mpn_fft_fftinv): Analogous changes.
(mpn_fft_sub_modF): Re-enable, now needed by mpn_fft_fft and
2008-03-10 Torbjorn Granlund <>
* tests/mpz/t-mul.c (main): Let GMP_CHECK_FFT mean largest allowed
power-of-2 of test operands.
2008-02-28 Torbjorn Granlund <>
* tests/cxx/ (check_mpz): Expect floor rounding for right
2008-02-27 Torbjorn Granlund <>
* mpz/mul_i.h: Check sml's size (not the signed small_mult).
* longlong.h (umul_ppmm) [alpha]: Define using __builtin_alpha_umulh
when possible.
* longlong.h (count_trailing_zeros): Force destination register mode.
* gmpxx.h (struct __gmp_binary_rshift): Use floor rounding, not
* gmpxx.h (__gmp_binary_and, __gmp_binary_ior, __gmp_binary_xor):
Add variants with unsigned long int argument.
* config.sub: Recognize geode.
* config.guess: Likewise.
* acinclude.m4 (X86_PATTERN): Likewise.
2008-02-10 Torbjorn Granlund <>
* mpn/x86/p6/aors_n.asm: Use Zdisp to work around GNU as bug.
* mpn/x86/x86-defs.m4 (Zdisp): Add more instructions.
2008-02-08 Torbjorn Granlund <>
* mpn/x86_64/aors_n.asm: New file.
* mpn/x86_64/add_n.asm: Delete.
* mpn/x86_64/sub_n.asm: Delete.
2008-02-07 Torbjorn Granlund <>
* mpn/x86/k6/mmx/dive_1.asm: Fix typo in last change.
2007-12-10 Torbjorn Granlund <>
* mpf/set_str.c (mpf_set_str): Write own code for converting the
exponent, avoids strtol base < 36 limitation.
2007-10-28 Torbjorn Granlund <>
* gmp-impl.h (mpn_dc_get_str_itch): New macro.
(mpn_dc_get_str_powtab_alloc): New macro.
(struct powers): Add field "shift".
* mpn/generic/get_str.c: Compute powers without low zero limbs; all
functions modified. Correct temporary allocation. Misc cleanups.
* mpn/generic/set_str.c: Compute powers without low zero limbs; all
functions modified.
(mpn_dc_set_str): Remove impossible case, replace by an ASSERT.
2007-10-26 Torbjorn Granlund <>
* mpn/generic/set_str.c: Remove default thresholds, now in gmp-impl.h.
(mpn_dc_set_str): Insert ASSERT_ALWAYS in a presumably dead code arm.
2007-10-22 Torbjorn Granlund <>
* gmp-impl.h (mpn_add_nc): Define as inline function, unless NATIVE.
(mpn_sub_nc): Likewise.
2007-10-17 Torbjorn Granlund <>
* tests/misc/t-printf.c: Fix a printf type clash.
* tests/mpq/t-get_str.c: Likewise.
* tests/mpz/t-import.c: Likewise.
* acinclude.m4: Conditionally disable some tests when compiled by a C++
* gmp-impl.h (udiv_qrnnd_preinv3): Remove an unused variable.
* mpn/generic/hgcd.c: Add some WANT_ASSERTs to shut up warnings.
2007-10-08 Torbjorn Granlund <>
* mpn/powerpc64/elf.m4 (LEAL): Define as an alias for LEA.
* mpn/powerpc32/darwin.m4 (LEAL): Likewise.
* mpn/powerpc64/aix.m4: Likewise.
* mpn/powerpc64/vmx/popcount.asm: Use LEAL.
* mpn/powerpc64/darwin.m4 (LEAL): New name for LEA, since it is only
usable for local symbols.
(LEA): Replace with code for external references.
* mpn/powerpc32/vmx/mod_34lsub1.asm: Use LEAL.
2007-10-07 Torbjorn Granlund <>
* mpn/x86/dive_1.asm: Use LEA, remove explicit movl_eip_*.
* mpn/x86/k6/mode1o.asm: Likewise.
* mpn/x86/k6/mmx/dive_1.asm: Likewise.
* mpn/x86/k7/dive_1.asm: Likewise.
* mpn/x86/k7/mode1o.asm: Likewise.
* mpn/x86/p6/dive_1.asm: Likewise.
* mpn/x86/p6/mode1o.asm: Likewise.
* mpn/x86/pentium4/sse2/dive_1.asm: Likewise.
* mpn/x86/pentium4/sse2/mode1o.asm: Likewise.
* mpn/x86/pentium4/sse2/popcount.asm: Likewise.
* mpn/x86/p6/aors_n.asm: Table cycle counts.
* mpn/x86/k7/mod_34lsub1.asm: Fix over-optimistic cycle count claims.
* mpn/x86/x86-defs.m4 (DEF_OBJECT, END_OBJECT): New defines.
* mpn/x86/darwin.m4 (LEA): Put also movl_eip_XX into EPILOGUE_cpu.
Expect target register to have prepended %.
* mpn/x86_64/add_n.asm: Use L() for labels.
* mpn/x86_64/addlsh1_n.asm: Likewise.
* mpn/x86_64/addmul_2.asm: Likewise.
* mpn/x86_64/aorrlsh_n.asm: Likewise.
* mpn/x86_64/aorsmul_1.asm: Likewise.
* mpn/x86_64/com_n.asm: Likewise.
* mpn/x86_64/copyd.asm: Likewise.
* mpn/x86_64/copyi.asm: Likewise.
* mpn/x86_64/diveby3.asm: Likewise.
* mpn/x86_64/logops_n.asm: Likewise.
* mpn/x86_64/lshsub_n.asm: Likewise.
* mpn/x86_64/mul_1.asm: Likewise.
* mpn/x86_64/mul_2.asm: Likewise.
* mpn/x86_64/mul_basecase.asm: Likewise.
* mpn/x86_64/popham.asm: Likewise.
* mpn/x86_64/redc_1.asm: Likewise.
* mpn/x86_64/rsh1add_n.asm: Likewise.
* mpn/x86_64/rsh1sub_n.asm: Likewise.
* mpn/x86_64/rshift.asm: Likewise.
* mpn/x86_64/sub_n.asm: Likewise.
* mpn/x86_64/sublsh1_n.asm: Likewise.
* mpn/x86_64/pentium4/aors_n.asm: Likewise.
* mpn/x86_64/pentium4/lshift.asm: Likewise.
* mpn/x86_64/pentium4/rshift.asm: Likewise.
* mpn/x86_64/x86_64-defs.m4: New file, defining LEA, DEF_OBJECT, and
* mpn/generic/mul.c: Put TMP_DECL as last decl.
2007-10-06 Torbjorn Granlund <>
* mpn/x86/pentium4/sse2/popcount.asm: New file.
2007-09-26 Torbjorn Granlund <>
* mpz/get_str.c: Cast a char index to int to shut up compilers.
* dc_div_qr.c: Pass dummy scratch argument to mpn_invert.
* dc_divappr_q.c: Likewise.
* mu_div_qr.c: Likewise.
* mu_divappr_q.c: Likewise.
* mu_div_q.c: Likewise.
* divexact.c: Likewise.
* mpn/generic/invert.c: New file, placeholder for now.
2007-09-24 Torbjorn Granlund <>
* mpn/generic/toom_interpolate_5pts.c: New file, contents from
* mpn/generic/mul_n.c (mpn_toom3_interpolate): Function removed.
* mpn/generic/toom_interpolate_7pts.c: New file.
* mpn/x86/k7/mmx/popham.asm: Table cycle counts.
* mpn/x86/k6/README: Update URLs.
* mpn/powerpc32/README: Update URLs, company names.
* mpn/generic/get_d.c: Complete rewrite.
* mpn/generic/mul_toom33.c: New file.
* mpn/generic/mul_toom22.c: Make orthogonal with other toomXY files.
* mpn/generic/mul_toom32.c: Likewise.
* mpn/generic/mul_toom42.c: Likewise.
* mpn/alpha/invert_limb.asm: Update cycle counts. Fix a comment typo.
* mpf/get_str.c: Include stdlib.h, not stdio.h for NULL.
* doc/gmp.texi: Fix a typo.
* memory.c (__gmp_default_allocate, __gmp_default_reallocate):
Cast size operands in error fprintf's.
* longlong.h (sub_ddmmss) [powerpc 64]: Add more variants for constant
* gmp-impl.h (udiv_qrnnd_preinv3): New define.
* gmp-impl.h (ULONG_PARITY): Exclude masquerading __INTEL_COMPILER from
ia64 asm.
* (mpn_neg_n): New function.
2007-09-18 Torbjorn Granlund <>
* demos/pexpr.c (main): Add -v option.
(enum op_t): New tag TIMING.
(mpz_eval_expr): Execute TIMING.
(fns): Add TIMING entry.
* gmp-impl.h: Add decls and THRESHOLDs for new toom multiplication
functions and division functions.
2007-09-10 Torbjorn Granlund <>
* mpn/powerpc32/addlsh1_n.asm: Use L() for labels.
* mpn/powerpc32/sublsh1_n.asm: Likewise.
2007-09-09 Torbjorn Granlund <>
* mpn/x86/x86-defs.m4 (LEA): New define.
* mpn/x86/darwin.m4: New file, for now just defining LEA.
* Pick up x86/darwin.m4.
* mpn/x86/*: Use LEA for PIC references.
* For X86/32, treat core2 like pentium3.
2007-09-06 Torbjorn Granlund <>
* tests/amd64check.c (calling_conventions_values): Put constants,
dynamic values in this array (was in scalars).
(calling_conventions_check): Corresponding changes.
* tests/amd64call.asm: Rewrite to be PIC, smaller, using amd64check.c's
2007-09-04 Torbjorn Granlund <>
* mpn/x86/pentium4/sse2/mul_basecase.asm: Misc cleanups.
* mpn/x86/pentium4/sse2/sqr_basecase.asm: Likewise.
* mpn/x86_64/mod_34lsub1.asm: Optimize loop, reduce code size.
* tests/amd64call.asm: Remove bogus no-op moves.
2007-09-03 Torbjorn Granlund <>
From Richard Guenther:
* (__GMP_EXTERN_INLINE): Declare conditionally on
* tests/cxx/ #include <cstdlib>, for abort.
* mpn/x86_64/core2/popcount.asm: New file.
* mpn/x86_64/pentium4/popcount.asm: New file.
* mpn/x86_64/addmul_2.asm: New file.
* mpn/x86_64/mul_2.asm: New file.
* mpn/x86_64/aorsmul_1.asm: Use 32-bit mov for zeroing registers
(saves space).
2007-09-01 Torbjorn Granlund <>
* Handle athlon64, core2, and pentium4 separately for
64-bit ABI.
* config.sub: Recognize athlon64, core2, and opteron.
* config.guess: Do two x86 variants, for 32-bit ABI and 64-bit ABI.
Return "athlon64" and "core2", not x86_64.
2007-08-31 Torbjorn Granlund <>
From Patrick Pelissier:
* Don't refer to FILE from C++ unless we've seen FILE.
2007-08-30 Torbjorn Granlund <>
* demos/isprime.c: Include string.h for strcmp.
* demos/factorize.c (main): Declare as int.
2007-06-22 Torbjorn Granlund <>
* mpn/x86_64/pentium4/lshift.asm: Minor tuning.
* mpn/x86_64/pentium4/rshift.asm: Likewise.
2007-05-30 Torbjorn Granlund <>
* mpn/powerpc64/mode64/aors_n.asm: Add _nc entry points.
2007-05-22 Torbjorn Granlund <>
* tests/memory.c: Cast calls to new mem* calls to avoid unaligned ops.
2007-05-16 Torbjorn Granlund <>
* tests/mpz/convert.c: Tweak operand sizes for best coverage.
* tests/memory.c: Add red zones around allocations.
2007-05-15 Torbjorn Granlund <>
* mpn/ia64/mul_1.asm: Make mul_1c entry point actually work.
* mpn/generic/set_str.c (mpn_dc_set_str): Avoid calling mpn_add_n when
ln == 0.
* tests/mpz/convert.c (string_urandomb): New function.
(main): Use it by enabling ifdef'ed out code.
2007-04-30 Torbjorn Granlund <>
* mpn/x86_64/mul_basecase.asm: Complete rewrite.
* mpn/x86_64/copyi.asm: Use short shift-by-one form. Misc cleanups.
* mpn/x86_64/copyi.asm: Likewise.
* mpn/x86_64/popham.asm: Likewise.
* mpn/x86_64/aorsmul_1.asm: Cleanup formatting.
2007-04-25 Torbjorn Granlund <>
* mpz/divexact.c: Handle undefined case of |N| < |D| to avoid segfaults.
2007-02-24 Torbjorn Granlund <>
* doc/gmp.texi (Toom 3-Way Multiplication): Fix typo.
(mpz_scan0, mpz_scan1): Fix typos.
(Float Internals): Rewrite paragraph about struct types.
2007-02-12 Torbjorn Granlund <>
* mpn/x86/pentium4/sse2/sqr_basecase.asm: Complete rewrite (except
diagonal code).
2007-02-05 Torbjorn Granlund <>
* mpn/generic/mul_fft.c (mpn_fft_fft): New name for mpn_fft_fft_sqr,
old mpn_fft_fft removed.
(mpn_mul_fft_internal): Call mpn_fft_fft separately for each operand.
(mpn_fft_add_modF): Rewrite to avoid random branches.
(mpn_fft_sub_modF): Likewise.
* mpn/x86/pentium4/sse2/addmul_1.asm: Complete rewrite.
* mpn/x86/pentium4/sse2/mul_1.asm: Complete rewrite.
* mpn/x86/pentium4/sse2/mul_basecase.asm: Complete rewrite, based on
new addmul and mul code.
2007-01-31 Torbjorn Granlund <>
* mpn/generic/get_str.c (mpn_sb_get_str): Get loop count for frac
development right.
* mpn/powerpc32/vmx/mod_34lsub1.asm: New file.
* mpn/powerpc32/aors_n.asm: New file, complete rewrite.
* mpn/powerpc32/add_n.asm: Remove.
* mpn/powerpc32/sub_n.asm: Remove.
2007-01-25 Torbjorn Granlund <>
* mpn/x86_64/core2/aors_n.asm: Add _nc entry points, minor cleanups.
* mpn/x86_64/core2/lshift.asm: Rewrite.
* mpn/x86_64/core2/rshift.asm: Rewrite.
* mpn/x86_64/pentium4/lshift.asm: Swap some loop insns for a small
* mpn/x86_64/pentium4/rshift.asm: New file, based on lshift.asm.
* mpn/x86_64/pentium4/gmp-mparam.h: New file.
* mpn/x86_64/pentium4/aors_n.asm: Complete rewrite of add/subtract
* mpn/x86_64/pentium4/add_n.asm: Remove.
* mpn/x86_64/pentium4/sub_n.asm: Remove.
2007-01-20 Torbjorn Granlund <>
* mpn/x86_64/lshift.asm: Add special case for cnt=1.
2007-01-19 Torbjorn Granlund <>
* mpn/x86_64/aorsmul_1.asm: New file, written from scratch, finally at
3.0 c/l on K8 (addmul_1 was 3.3; submul_1 was 3.5).
* mpn/x86_64/addmul_1.asm: Remove.
* mpn/x86_64/submul_1.asm: Remove.
2006-12-29 Torbjorn Granlund <>
* randmt.c (__gmp_randclear_mt): Initialize ALLOC field, like in
(__gmp_randclear_mt, __gmp_randinit_mt_noseed): Make similar functions
look similar.
(__gmp_randclear_mt): Pass actually allocated size.
* mpn/ (nodist_EXTRA_libmpn_la_SOURCES): Add mul_toom22.c,
mul_toom32.c, mul_toom42.c.
* Recognize athlon64 and core2 as alternatives to x86_64.
Provide special settings for core2.
* (gmp_mpn_functions): Add mul_toom22, mul_toom32,
* mpn/generic/mul_toom22.c: New file.
* mpn/generic/mul.c: Use mpn_mul_toom22. Trim cutoff points between
the mpn_mul_toomN2 functions. Handle balanced operands at function
2006-12-29 Marco Bodrato <>
* mpn/generic/mul_n.c: Rewrite interpolation code.
2006-12-28 Torbjorn Granlund <>
* mpn/generic/mul_toom32.c: New file.
* mpn/generic/mul_toom42.c: New file.
* mpn/generic/mul.c: Use mpn_mul_toom32 and mpn_mul_toom42 for
unbalanced operands.
2006-12-17 Torbjorn Granlund <>
* mpn/x86_64/aorrlsh_n.asm: New file.
* mpn/x86_64/lshsub_n.asm: New file.
* mpn/x86_64/core2/aors_n.asm: New file.
* mpn/x86_64/core2/lshift.asm: New file.
* mpn/x86_64/core2/rshift.asm: New file.
* mpn/x86/p6/aors_n.asm: Replace K7 grabbing code with P6 specific
* mpn/x86/p6/lshsub_n.asm: New file.
2006-11-23 Torbjorn Granlund <>
* tune/speed.h (SPEED_ROUTINE_MPN_MUL_BASECASE): Allocate space for xp
locally, s->xp might be insufficient.
2006-11-22 Torbjorn Granlund <>
* randmt.c (__gmp_randinit_mt_noseed): Initialize ALLOC field of result
2006-11-06 Torbjorn Granlund <>
* tune/set_strp.c: New file.
2006-11-04 Torbjorn Granlund <>
* extract-dbl.c: Rewrite to handle nails better, and for general
* mpz/bin_uiui.c: Simplify.
* longlong.h (umul_ppmm) [mmix]: New.
* tune/tuneup.c, tune/common.c, tune/speed.c, tune/speed.h,
tune/set_strb.c, tune/set_strs.c: Add tuning and speed measurements
Add tuning and speed measurement of mpn_addsub_n.
2006-10-31 Torbjorn Granlund <>
* gmpxx.h: Remove ternary stuff, it is hardly an optimization and it
writes to destination before reading all source operands.
2006-10-25 Torbjorn Granlund <>
* mpn/generic/set_str.c: Complete rewrite.
* mpn/generic/get_str.c: Likewise.
* gmp-impl.h (struct powers, powers_t): New types.
Restructure GET_STR_* and SET_STR_* thresholds.
2006-09-21 Torbjorn Granlund <>
* mpn/generic/rootrem.c: Remove some redundant casts.
2006-07-12 Torbjorn Granlund <>
* mpn/alpha/ev6/nails/addmul_2.asm: Make it run at claimed speed.
* mpn/alpha/ev6/nails/addmul_4.asm: Likewise.
* mpf/get_str.c: Avoid copying result when not needed. Misc cleanups.
* tests/amd64call.asm: Use jmp instead of jmpq to placate Solaris.
2006-06-30 Torbjorn Granlund <>
* (powerpc-*): Remove repeated path component.
2006-06-15 Torbjorn Granlund <>
* (ia64-*-linux*): Don't use -O3.
2006-06-14 Torbjorn Granlund <>
* mpq/get_str.c: Fix upper base limit boundary in an ASSERT.
* tests/refmpn.c (refmpn_sb_divrem_mn): Use ASSERT_CARRY for add-back.
2006-05-31 Torbjorn Granlund <>
* tests/mpz/t-set_d.c (check_data): Add more data points.
* mpz/set_d.c: Handle negative return values from __gmp_extract_double.
2006-05-17 Torbjorn Granlund <>
* Clear out gcc_cflags_cpu and gcc_cflags_arch for a fat
2006-05-16 Torbjorn Granlund <>
* demos/primes.c (find_primes): Increase mpz_probab_prime_p cnt to 10.
* mpn/generic/addsub_n.c: Fix criteria for when to call _nc functions.
2006-05-12 Torbjorn Granlund <>
* config.guess: Recognize more ppc processor types.
2006-05-11 Torbjorn Granlund <>
* tune/speed.c (usage): Update URL for gnuplot and quickplot.
2006-05-10 Torbjorn Granlund <>
* (powerpc-*-*): Pass -maltivec to assembler for
appropriate CPUs.
2006-05-08 Torbjorn Granlund <>
* mpn/powerpc32/aix.m4 (LEA): Remove [RW] attribute.
2006-05-03 Torbjorn Granlund <>
* mpn/powerpc64/vmx/popcount.asm: Conditionally zero extend n.
2006-04-27 Torbjorn Granlund <>
* mpz/divexact.c: Call mpz_tdiv_q for large operands.
* (powerpc-*-darwin): Remove -fast, it affects PIC.
2006-04-26 Torbjorn Granlund <>
* config.guess: Try to recognize Ultrasparc T1 (as ultrasparct1).
* config.sub: Handle ultrasparct1.
2006-04-25 Torbjorn Granlund <>
* mpn/sparc64/gmp-mparam.h: Retune, without separation of GNUC and
non-GNUC data.
2006-04-20 Torbjorn Granlund <>
* tests/mpz/convert.c: Increase operands range.
2006-04-19 Torbjorn Granlund <>
* Support powerpc eABI.
* mpn/powerpc32/eabi.m4: New file.
* Support powerpc *bsd.
* mpn/powerpc64/elf.m4: New name for mpn/powerpc64/linux64.m4.
* mpn/powerpc32/elf.m4: New name for mpn/powerpc32/linux.m4.
* mpn/powerpc64/linux64.m4 (ASM_END): Quote TOC_ENTRY.
2006-04-18 Torbjorn Granlund <>
* (gmp_mpn_functions_optional): Add lshiftc.
(HAVE_NATIVE): Add lshiftc.
* mpn/powerpc64/mode64/invert_limb.asm: Use LEA, not LDSYM.
* mpn/powerpc64/mode64/mode1o.asm: Likewise.
* mpn/powerpc64/mode64/dive_1.asm: Likewise.
* mpn/powerpc64/linux64.m4 (TOC_ENTRY): Define to empty.
* mpn/powerpc64/aix.m4 (TOC_ENTRY): Likewise.
* mpn/powerpc32/aix.m4 (TOC_ENTRY): Likewise.
* mpn/powerpc32/aix.m4 (EXTERN): New, copied from powerpc64/aix.m4.
* mpn/powerpc32/mode1o.asm: Use EXTERN.
* mpn/powerpc32/linux.m4 (EXTERN): Provide dummy definition.
* mpn/powerpc32/darwin.m4 (EXTERN): Likewise.
2006-04-13 Torbjorn Granlund <>
* mpn/generic/mul_fft.c: Use new thresholds mechanism if MUL_FFT_TABLE2
is defined.
(mpn_lshiftc): New name for mpn_lshift_com (for consistency with some
stuff already in 4.1.4).
(mpn_fft_mul_2exp_modF): Reorganize initial operand reductions to avoid
* tests/devel/try.c (choice_array): Add mpn_addsub_n[c].
2006-04-11 Torbjorn Granlund <>
* aclocal.m4: Regenerate with patched libtool.
* mpn/asm-defs.m4 (ASM_END): Provide (empty) default.
2006-04-08 Torbjorn Granlund <>
* (gmp_mpn_functions_optional): Add addsub.
* gmpxx.h: Remove missed MPFR references.
* gmp-impl.h (LIMBS_PER_DOUBLE): Adjust formula to not be pessimistic.
* gmp-impl.h (TMP_*, WANT_TMP_DEBUG): Don't expect marker argument;
* mpn/minithres/gmp-mparam.h: New file.
* tests/mpz/t-io_raw.c: Fix printf type/arg mismatches.
* tests/mpz/t-export.c: Likewise.
* tests/mpz/io.c: Likewise.
* tests/t-constants.c: Likewise.
* mpn/ia64/popcount.asm: Append "cond.dptk" to conditional branches to
placate icc.
* mpn/ia64/hamdist.asm: Likewise.
* mpn/ia64/lorrshift.asm: Likewise.
* mpn/ia64/dive_1.asm: Likewise.
2006-04-05 Torbjorn Granlund <>
* tal-notreent.c (__gmp_tmp_mark): Add "struct" tag for tmp_marker.
(__gmp_tmp_free): Likewise.
* mpn/generic/mul_fft.c: Optimize many scalar divisions and mod
operations into masks and shifts.
(mpn_fft_mul_modF_K): Fix a spurious ASSERT_NOCARRY.
2006-03-26 Torbjorn Granlund <>
* Version 4.2 released.
* mpn/powerpc64/aix.m4 (LEA): Renamed from LDSYM.
* mpn/powerpc64/darwin.m4: Likewise.
* mpn/powerpc64/linux64.m4: Likewise.
* mpn/powerpc64/vmx/popcount.asm: Use LEA, not LDSYM.
2006-03-23 Torbjorn Granlund <>
* gmp-impl.h (class gmp_allocated_string): Prefix strlen with std::.
* gmpxx.h (__GMP_DEFINE_TERNARY_EXPR2): Remove for now.
(struct __gmp_ternary_addmul2): Likewise.
(struct __gmp_ternary_submul2): Likewise.
* gmpxx.h: #include <cstring>.
(struct __gmp_alloc_cstring): Prefix strlen with std::.
* mpn/x86/pentium/com_n.asm: Add TEXT and ALIGN.
* mpn/x86/pentium/copyi.asm: Likewise.
* mpn/x86/pentium/copyd.asm: Likewise.
2006-03-22 Torbjorn Granlund <>
* Add a "using std::FILE" for C++.
* gmpxx.h: Remove mpfr code.
* tests/cxx: Likewise.
* gmp-impl.h (FORCE_DOUBLE): Rename a tempvar to avoid a clash with
GNU/Linux public include file.
* (powerpc64, darwin): New optional, gcc_cflags_subtype.
Grab powerpc32/darwin.m4 for ABI=mode32.
* Use host_cpu whenever just the cpu type is needed.
2006-03-08 Torbjorn Granlund <>
* mpz/get_si.c: Fix a typo.
* tests/mpq/t-get_d.c (check_random): Improve random generation for
2006-02-28 Torbjorn Granlund <>
* tests/mpq/t-get_d.c (check_random): New function.
(main): Call check_random.
* mpq/set_d.c: Make choices based on LIMBS_PER_DOUBLE, not
BITS_PER_MP_LIMB. Make it work for LIMBS_PER_DOUBLE == 4.
* mpz/set_d.c: Make it work for LIMBS_PER_DOUBLE == 4.
* extract-dbl.c: Make it work for LIMBS_PER_DOUBLE > 3.
2006-02-27 Torbjorn Granlund <>
* mpz/cmp_d.c: Declare `i'.
* mpz/cmpabs_d.c: Likewise.
2006-02-23 Torbjorn Granlund <>
* mpn/powerpc32/vmx/copyd.asm: Set right VRSAVE bits.
* mpn/powerpc32/vmx/copyi.asm: Likewise.
2006-02-22 Torbjorn Granlund <>
* mpn/powerpc32/vmx/logops_n.asm: New file.
* mpn/powerpc32/diveby3.asm: Rewrite.
2006-02-21 Torbjorn Granlund <>
* mpn/powerpc32/vmx/copyi.asm: New file.
* mpn/powerpc32/vmx/copyd.asm: New file.
2006-02-17 Torbjorn Granlund <>
* mpn/alpha/ev6/nails/aors_n.asm (CYSH): Import proper setting from
deleted mpn_sub_n.
2006-02-16 Torbjorn Granlund <>
* mpn/alpha/ev6/addmul_1.asm: Correct slotting comments.
2006-02-15 Torbjorn Granlund <>
* tests/devel/anymul_1.c: Copy error reporting code from addmul_N.c.
* tests/devel/addmul_N.c: New file.
* tests/devel/mul_N.c: New file.
* mpn/alpha/default.m4 (PROLOGUE_cpu): Align functions at 16-byte
* mpn/alpha/ev6/nails/aors_n.asm: New file.
* mpn/alpha/ev6/nails/add_n.asm: Remove.
* mpn/alpha/ev6/nails/sub_n.asm: Remove.
* mpn/alpha/ev6/nails/addmul_1.asm: Rewrite.
* mpn/alpha/ev6/nails/submul_1.asm: Likewise.
* mpn/alpha/ev6/nails/mul_1.asm: Likewise.
* mpn/alpha/ev6/nails/addmul_2.asm: Use L() for labels.
* mpn/alpha/ev6/nails/addmul_3.asm: Use L() for labels.
* mpn/alpha/ev6/nails/addmul_4.asm: Use L() for labels.
2006-02-13 Torbjorn Granlund <>
* mpn/powerpc32/diveby3.asm: Trivially reorder loop insns to save
1 c/l.
* mpn/x86_64/dive_1.asm: Use movabsq to support large model non-PIC.
* mpn/x86_64/rsh1add_n.asm: Replace high register with rbx.
* mpn/x86_64/rsh1sub_n.asm: Likewise.
2006-02-10 Torbjorn Granlund <>
* mpn/powerpc64/sqr_diagonal.asm: Software pipeline.
* mpn/powerpc64/vmx/popcount.asm: Add prefetching.
2006-02-07 Torbjorn Granlund <>
* mpn/powerpc64/mode64/diveby3.asm: Rewrite.
2006-02-04 Torbjorn Granlund <>
* mpn/powerpc64/vmx/popcount.asm: Remove mpn_hamdist partial code.
Move compare for huge n so that it is always executed.
2006-02-03 Torbjorn Granlund <>
* mpn/powerpc32/linux.m4 (LEA): Add support for PIC.
* (powerpc): New optional, gcc_cflags_subtype.
* mpn/x86_64/pentium4/add_n.asm: New file.
* mpn/x86_64/pentium4/sub_n.asm: New file.
* mpn/x86_64/pentium4/lshift.asm: New file.
* mpn/powerpc64/linux64.m4 (PROLOGUE_cpu): Align function start to
* mpn/powerpc64/aix.m4: Likewise.
* mpn/powerpc64/darwin.m4: Likewise.
* mpn/powerpc64/copyi.asm: Align loop to 16-multiple.
* mpn/powerpc64/copyd.asm: Likewise.
* (powerpc): Add vmx to relevant paths.
* mpn/powerpc64/linux64.m4 (DEF_OBJECT): Accept 2nd argument, for
* mpn/powerpc64/aix.m4: Likewise.
* mpn/powerpc64/darwin.m4: Likewise.
* mpn/powerpc32/linux.m4 (DEF_OBJECT, END_OBJECT): New macros,
inherited from powerpc64 versions.
* mpn/powerpc32/aix.m4: Likewise.
* mpn/powerpc32/darwin.m4: Likewise.
* mpn/powerpc64/vmx/popcount.asm: New file, for ppc32 and ppc64.
* mpn/powerpc32/vmx/popcount.asm: New file, grabbing above file.
2006-01-22 Torbjorn Granlund <>
* Generalize OS-dependent patterns for powerpcs.
2006-01-20 Torbjorn Granlund <>
* mpn/x86_64/popham.asm: Optimize.
* config.guess: Recognize power4 and up under linux-gnu.
* config.sub: Generalize power recognition code.
* acinclude.m4 (POWERPC64_PATTERN): Add 64-bit powerpc processors.
* Recognize powerpc processors masquerading as power
2006-01-19 Torbjorn Granlund <>
* mpn/x86_64/logops_n.asm: Rewrite for more stable speed and smaller
* mpn/x86_64/com_n.asm: Likewise.
2006-01-18 Torbjorn Granlund <>
* mpn/x86_64/addlsh1_n.asm: Rewrite to use indexed addressing.
* mpn/x86_64/sublsh1_n.asm: Likewise.
2006-01-17 Torbjorn Granlund <>
* mpn/generic/diveby3.c: Use GMP standard parameter names. Nailify
alternative code. Use restrict for params.
* Recognize andn_n as not needing nailification.
* tests/mpq/t-equal.c (check_various): Disable a test that gives common
factors for GMP_NUMB_BITS == 62.
2006-01-16 Torbjorn Granlund <>
* mpn/generic/get_str.c (mpn_sb_get_str): Fix digit count computation,
was inaccurate for nails.
2006-01-15 Torbjorn Granlund <>
* mpn/x86_64/mode1o.asm: Remove unneeded carry register zeroing.
2006-01-08 Torbjorn Granlund <>
* mpn/alpha/ev6/sqr_diagonal.asm: New file.
2006-01-06 Torbjorn Granlund <>
* mpn/powerpc64/mode64/mod_34lsub1.asm: Tune to 1.5 c/l.
* mpn/generic/mullow_n.c (MUL_BASECASE_ALLOC): New #define.
(mpn_mullow_n): Use it.
* mpn/powerpc64/mode64/dive_1.asm: Use EXTERN.
* mpn/powerpc64/mode64/mode1o.asm: Likewise.
* mpn/powerpc64/aix.m4 (EXTERN): Define to import symbol.
(LDSYM): Remove [RW] attribute.
* mpn/powerpc64/linux64.m4 (EXTERN): Dummy definition.
* mpn/powerpc64/darwin.m4 (EXTERN): Likewise.
2006-01-05 Torbjorn Granlund <>
* mpn/powerpc64/mode64/mode1o.asm: New file.
* mpn/powerpc64/mode64/dive_1.asm: Use L() for labels. Invoke ASM_END.
* mpn/powerpc64/mode64/invert_limb.asm: Invoke ASM_END.
* mpn/powerpc64/linux64.m4: Move toc entry generation from direct at
DEF_OBJECT to delayed via LDSYM, define ASM_END to output it.
* mpn/powerpc64/aix.m4: Likewise.
* mpn/powerpc64/darwin.m4: Define a dummy ASM_END.
* mpn/powerpc64/mode64/addmul_1.asm: Add POWER5 timings.
* mpn/powerpc64/mode64/mul_1.asm: Likewise.
* mpn/powerpc64/mode64/submul_1.asm: Tweak to save 1.5 c/l for POWER5.
2006-01-04 Torbjorn Granlund <>
* mpn/powerpc64/mode64/dive_1.asm: New file.
* mpn/powerpc64/mode64/invert_limb.asm: Add missing ASM_START.
* mpn/powerpc64/mode64/addmul_1.asm: Fix a comment typo.
* mpn/x86_64/diveby3.asm: Rewrite.
2006-01-03 Torbjorn Granlund <>
* Update bugs reporting address.
* mpn/powerpc64/mode64/diveby3.asm: Trim a cycle off of POWER4 timing.
Misc cleanup.
2006-01-02 Torbjorn Granlund <>
* mpn/powerpc64/linux64.m4 (CALL): New macro.
* mpn/powerpc64/aix.m4: Likewise.
* mpn/powerpc64/darwin.m4: Likewise, also define macro "DARWIN".
2005-12-28 Torbjorn Granlund <>
* mpn/powerpc64/mode64/mod_34lsub1.asm: New file.
2005-12-26 Torbjorn Granlund <>
* mpn/x86_64/mod_34lsub1.asm: New file.
2005-12-20 Torbjorn Granlund <>
* mpn/x86_64/submul_1.asm: Save a push/pop by not using register r12.
Use addq instead of leaq for pointer updates; schedule them. (These
changes shave one cycle of overhead and 0.25 c/l.)
2005-12-18 Torbjorn Granlund <>
* mpf/ui_div.c: Implement workaround for GCC bug triggered on alpha.
* mpf/set_q.c: Likewise.
2005-12-16 Torbjorn Granlund <>
* mpn/generic/tdiv_qr.c: Remove statement with no effect.
Rename dead variable to `dummy'.
2005-12-15 Torbjorn Granlund <>
* demos/pexpr.c (setup_error_handler): Add a missing ";".
2005-11-27 Torbjorn Granlund <>
* mpn/generic/mul.c: Crudely call mpn_mul_fft_full before checking
for unbalanced operands.
* mpn/generic/mul_fft.c: Remove many scalar divisions.
(mpn_mul_fft_lcm): Simplify.
(mpn_mul_fft_decompose): Rewrite to handle arbitrarily unbalanced
2005-11-22 Torbjorn Granlund <>
* Properly recognize all 32-bit Solaris releases.
2005-11-10 Torbjorn Granlund <>
* mpn/generic/mul_fft.c: Inline mpn_fft_mul_2exp_modF,
mpn_fft_add_modF and mpn_fft_normalize.
2005-11-02 Torbjorn Granlund <>
* tests/mpz/reuse.c: Increase operand size, decrease # of reps.
* mpz/rootrem.c: Adapt to new mpn_rootrem.
* mpz/root.c: Likewise.
* tests/mpz/reuse.c: Test mpz_rootrem.
With Paul Zimmermann:
* mpn/generic/rootrem.c: Complete rewrite.
2005-10-31 Torbjorn Granlund <>
* mpz/pprime_p.c (mpz_probab_prime_p): Considerably limit trial
* mpz/perfpow.c (mpz_perfect_power_p): Use mpz_divisible_ui_p instead
of mpz_tdiv_ui.
* mpz/divegcd.c: Correct probability number for GCD == 1.
* mpn/x86_64/mul_basecase.asm: Remove an obsolete comment.
* mpn/x86: Add cycle counts for array of x86 processors.
* mpn/x86/k7/mod_34lsub1.asm: Remove spurious mentions of ebp.
* mpn/powerpc32: Add POWER5 timings.
* mpn/powerpc32/README: Describe global reference variations.
* mpn/ia64/divrem_2.asm: Add some comments.
* mpn/ia64/divrem_1.asm: Reformat.
* mpn/ia64/addmul_2.asm: Correct a comment on slotting.
* mpn/ia64/logops_n.asm: Likewise.
* mpn/ia64/addmul_1.asm: Remove a redundant preg mutex decl.
* mpn/generic/dive_1.c: Whitespace cleanup.
* mpn/alpha/ev6/nails/addmul_1.asm: Correct comments on slotting.
* mpn/alpha/ev6/nails/addmul_2.asm: Likewise.
* mpn/alpha/ev6/nails/addmul_4.asm: Likewise.
* mpf/out_str.c: List some allocation improvement ideas.
* doc/gmp.texi: Update many URLs and email addresses.
2005-10-26 Torbjorn Granlund <>
* tune/tuneup.c (tune_mullow): Update param.max_size for each threshold
* (POWERPC64_PATTERN/*-*-darwin*): Set
SPEED_CYCLECOUNTER_OBJ_mode64 and cyclecounter_size_mode64.
(POWERPC64_PATTERN/*-*-linux*): Likewise.
2005-10-03 Torbjorn Granlund <>
* demos/factorize.c (factor_using_division_2kp): Honor verbose flag.
(factor_using_pollard_rho): Divide out new factor before it's
clobbered. Don't stop factoring after a composite factor was found.
2005-09-17 Torbjorn Granlund <>
* demos/pexpr.c (fns): Add factorial keywords.
2005-08-16 Torbjorn Granlund <>
* tune/ (EXTRA_DIST): Change "amd64" => "x86_64".
* mpn/ (TARG_DIST): Change "amd64" => "x86_64".
2005-08-15 Torbjorn Granlund <>
* Change "amd64" => "x86_64".
2005-06-13 Torbjorn Granlund <>
* mpn/generic/pre_mod_1.c: Canonicalize variable names.
* mpn/generic/divrem.c: Rate qxn test as UNLIKELY.
* mpn/generic/gcdext.c (sanity_check_row): Invoke TMP_MARK.
* tune/tuneup.c (tune_mullow): Fix all max_size fields.
* gmp-impl.h (SQR_TOOM3_THRESHOLD_LIMIT): New #define.
* tune/tuneup.c (tune_sqr): Use SQR_TOOM3_THRESHOLD_LIMIT.
(sqr_toom3_threshold): Initialize from SQR_TOOM3_THRESHOLD_LIMIT.
* mpn/generic/mul_n.c (mpn_sqr_n): Use SQR_TOOM3_THRESHOLD_LIMIT.
* gmp-impl.h (mpn_nand_n, mpn_iorn_n, mpn_nior_n, mpn_xnor_n):
Handle nails.
2005-06-13 Niels Möller <>
* mpn/generic/gcdext.c (gcdext_schoenhage): Check for the
(unlikely) case that one of the hgcd/euclid steps results in two
remainders of one limb each. Then use gcdext_1.
2005-06-12 Torbjorn Granlund <>
* mpn/alpha/ev6/sub_n.asm: Analogous changes as to add_n.asm last.
2005-06-11 Torbjorn Granlund <>
* mpn/alpha/ev6/add_n.asm: Rewrite inner loop to load later.
Add mpn_add_nc entry.
* mpn/alpha/ev6/addmul_1.asm: Remove redundant initial loads.
2005-06-09 Torbjorn Granlund <>
* mpn/ia64/dive_1.asm: Fix issues with HP-UX.
2005-06-08 Torbjorn Granlund <>
* mpn/ia64/diveby3.asm: Update TODO list.
* mpn/ia64/mode1o.asm: Fix comment typos.
* mpn/ia64/dive_1.asm: New file.
2005-06-07 Torbjorn Granlund <>
* mpn/ia64/mode1o.asm: Add prefetching.
* mpn/generic/dive_1.c: Use variable h for upper umul_ppmm result.
2005-06-06 Torbjorn Granlund <>
* mpn/ia64/hamdist.asm: Complete rewrite.
* mpn/ia64/popcount.asm: Rewrite to use multi-pronged feed-in.
* mpn/ia64/aors_n.asm: Rewrite feed-in code.
* mpn/ia64/rsh1aors_n.asm: Likewise.
* mpn/ia64/aorslsh1_n.asm: Likewise.
* mpn/ia64/lorrshift.asm: Likewise.
2005-06-04 Torbjorn Granlund <>
* tests/devel/try.c (choice_array): Exclude mpn_preinv_mod_1 unless
(choice_array): Exclude mpn_sqr_basecase if SQR_KARATSUBA_THRESHOLD
is zero.
2005-06-03 Torbjorn Granlund <>
* mpn/alpha/ev6/addmul_1.asm: Prefix all labels with "$".
* mpn/alpha/ev6/mul_1.asm: Likewise.
2005-06-02 Torbjorn Granlund <>
* tests/refmpn.c (refmpn_divmod_1c_workaround): Implement workaround
to gcc 3.4.x bug triggered on powerpc64 with 32-bit ABI.
2005-06-01 Torbjorn Granlund <>
* tests/devel/try.c (main): Fix a typo.
2005-05-31 Torbjorn Granlund <>
* mpn/alpha/ev6/addmul_1.asm: Rewrite for L1 cache, add prefetch.
2005-05-30 Torbjorn Granlund <>
* tests/misc.c (tests_rand_start): Mask random seed to 32 bits.
2005-05-29 Torbjorn Granlund <>
* mpn/powerpc64/mode32/mul_1.asm: Handle BROKEN_LONGLONG_PARAM.
* mpn/powerpc64/mode32/addmul_1.asm: Likewise.
* mpn/powerpc64/mode32/submul_1.asm: Likewise.
* mpn/powerpc32/mode1o.asm: Rewrite to actually work.
* mpn/powerpc32/aix.m4 (LEA): New macro.
(ASM_END): New macro.
* mpn/powerpc32/linux.m4: New file.
* mpn/powerpc32/darwin.m4: New file.
* Use linux.m4 and darwin.m4.
(powerpc64-linux-gnu): Add support for mode32.
2005-05-25 Torbjorn Granlund <>
* mpn/generic/mullow_n.c: Remove FIXME mentioning fixed flaw.
* tests/mpz/t-cmp_d.c (check_one): Fix printf fmt string typo.
* demos/isprime.c: #include stdlib.h.
* tests/rand/t-urbui.c: Likewise.
* tests/rand/t-urmui.c: Likewise.
* tests/mpz/t-popcount.c (check_random): Remove spurious printf arg.
* mpn/ia64/lorrshift.asm: Cleanup code layout.
* mpn/ia64/popcount.asm: Likewise.
2005-05-24 Torbjorn Granlund <>
* tests/devel/try.c (param_init) [TYPE_GET_STR]: Set retval field.
(compare): Handle SIZE_GET_STR as SIZE_RETVAL.
* tests/refmpn.c (refmpn_get_str): Rewrite to make it work.
2005-05-23 Torbjorn Granlund <>
* mpn/amd64/add_n.asm: Add mpn_add_nc entry point.
* mpn/amd64/sub_n.asm: Add mpn_sub_nc entry point.
* longlong.h (many places): Remove lvalue casts.
* gmp-impl.h (MPF_SIGNIFICANT_DIGITS): Cast prec to avoid overflow
for > 4G digits.
* mpn/alpha/ev6/add_n.asm: Prefetch using ldl.
* mpn/alpha/ev6/sub_n.asm: Likewise.
* mpn/alpha/ev6/ (optable): Recognize negq and ldl.
* mpn/ia64/aors_n.asm: Prefetch using lfetch.
* mpn/ia64/lorrshift.asm: Likewise.
* mpn/ia64/popcount.asm: Likewise.
* mpn/ia64/diveby3.asm: Likewise.
2005-05-22 Torbjorn Granlund <>
* mpn/alpha/ev67/popcount.asm: Prefetch.
* mpn/alpha/ev67/hamdist.asm: Prefetch.
* longlong.h (add_ssaaaa) [x86]: Remove lvalue casts.
(sub_ddmmss) [x86]: Likewise.
* tests/devel/try.c (param_init) [TYPE_MPZ_JACOBI]: Add DATA_SRC1_ODD.
(param_init) [TYPE_MPZ_KRONECKER]: Clear inherited DATA_SRC1_ODD.
(param_init) [TYPE_DIVEXACT_1]: Use symbolic name DIVISOR_LIMB.
2005-05-21 Torbjorn Granlund <>
* tests/devel/try.c (param_init) [TYPE_MPZ_JACOBI]: Initialize divisor
field according to UDIV_NEEDS_NORMALIZATION.
* mpz/mul_i.h: Remove left-over TMP_XXXX marker arguments.
2005-05-20 Torbjorn Granlund <>
* mpn/x86/pentium4/sse2/addmul_1.asm (mpn_addmul_1c): Put carry in
proper register.
* mpn/generic/sqr_basecase.c (mpn_sqr_basecase, addmul_2 version):
Avoid accesses out-of-bound in MPN_SQR_DIAGONAL applicate code.
2005-05-19 Torbjorn Granlund <>
* mpn/alpha/diveby3.asm: Make it actually work.
* gmp-impl.h (MULLOW_BASECASE_THRESHOLD_LIMIT): New #define.
* mpn/generic/mullow_n.c: Use fixed stack allocation for the smallest
operands; use TMP_S* allocation for medium operands.
* gmp-impl.h: Remove nested TUNE_PROGRAM_BUILD test.
2005-05-18 Torbjorn Granlund <>
* mpn/generic/mul_n.c: Make squaring and multiplication code more
similar. Use TMP_S* functions.
* gmp-impl.h (TMP_DECL, TMP_MARK, TMP_FREE): Get rid of argument.
(TMP_SALLOC): New macro for "small" allocations.
(TMP_BALLOC): New macro for "big" allocations.
(TMP_SDECL, TMP_SMARK, TMP_SFREE): New macros for functions that use
(WANT_TMP_ALLOCA): Make default functions choose alloca or reentrant
functions, depending on size.
* *.c: Remove TMP_XXXX marker arguments.
* acinclude.m4 (WANT_TMP): Want tal-reent.lo also for alloca case.
2005-05-16 Torbjorn Granlund <>
* mpn/ia64/gmp-mparam.h: Further extend FFT tables.
2005-05-15 Torbjorn Granlund <>
* gmp-impl.h (udiv_qrnnd_preinv2): Pull an add into add_ssaaaa.
(udiv_qrnnd_preinv2gen): Likewise.
2005-05-14 Torbjorn Granlund <>
* longlong.h (add_ssaaaa) [x86_64]: Restrict allowed immediate
(sub_ddmmss) [x86_64]: Likewise.
2005-05-02 Torbjorn Granlund <>
* acinclude.m4 (GMP_HPC_HPPA_2_0): Make gmp_tmp_v1 sed pattern handle
version numbers like B.11.X.32509-32512.GP.
* mpn/m68k/aors_n.asm: Correct MULFUNC_PROLOGUE.
* mpn/powerpc64/mode64/aors_n.asm: Add a MULFUNC_PROLOGUE.
* mpf/inp_str.c: Use plain int for mpf_set_str return value (works
around gcc 4 bug).
* acinclude.m4 (GMP_ASM_POWERPC_PIC_ALWAYS): Handle darwin's assembly
(long long reliability test 1): New GMP_PROG_CC_WORKS_PART test.
(long long reliability test 2): New GMP_PROG_CC_WORKS_PART test.
* Add mode64 support for darwin. Use darwin.m4.
Add cflags_opt flags for mode32 darwin.
* mpn/powerpc64: Use L() for all asm files.
* mpn/asm-defs.m4 (PIC_ALWAYS): Define PIC just iff PIC_ALWAYS = "yes".
* mpn/powerpc64/darwin.m4: New file.
* mpn/powerpc64/linux64.m4: Remove TOCREF, add LDSYM.
Rework DEF_OBJECT to need just one argument.
* mpn/powerpc64/aix.m4: Likewise.
* mpn/powerpc64/mode64/invert_limb.asm: Load approx_tab address with
LDSYM. Optimize somewhat. Remove 2nd DEF_OBJECT operand.
2005-05-01 Torbjorn Granlund <>
* mpn/generic/popham.c: Compute final summation differently for 64-bit.
* tests/mpz/t-popcount.c (check_random): New function.
(main): Call it.
2005-04-28 Torbjorn Granlund <>
* mpn/amd64/add_n.asm: Use r9 instead of rbx to save push/pop.
* mpn/amd64/sub_n.asm: Likewise.
2005-04-09 Torbjorn Granlund <>
* mpn/powerpc64/copyi.asm: If HAVE_ABI_mode32, ignore upper 32 bits of
mp_size_t argument.
* mpn/powerpc64/copyd.asm: Likewise.
* mpn/powerpc64/sqr_diagonal.asm: Likewise.
* mpn/powerpc64/lshift.asm: Likewise.
* mpn/powerpc64/rshift.asm: Likewise.
* mpn/powerpc64/logops_n.asm: Likewise.
* mpn/powerpc64/com_n.asm: Likewise.
2005-04-08 Torbjorn Granlund <>
* mpn/generic/rootrem.c: Allocate PP_ALLOC limbs also for qp.
2005-04-07 Torbjorn Granlund <>
* mpn/powerpc32/add_n.asm: Add nc entry point.
* mpn/powerpc32/sub_n.asm: Likewise.
* mpn/amd64/*.asm: Add Prescott/Nocona cycle/limb numbers.
* mpn/alpha/add_n.asm: Add correct cycle/limb numbers.
* mpn/alpha/sub_n.asm: Likewise.
* mpn/alpha/ev5/add_n.asm: Likewise.
* mpn/alpha/ev5/sub_n.asm: Likewise.
2005-03-31 Torbjorn Granlund <>
* mpn/x86/k7/gmp-mparam.h: Fix typo in last change.
2005-03-19 Torbjorn Granlund <>
* mpn/amd64/gmp-mparam.h: Update.
* mpn/alpha/gmp-mparam.h: Update.
* mpn/alpha/ev5/gmp-mparam.h: Update.
* mpn/alpha/ev6/gmp-mparam.h: Update.
* mpn/ia64/gmp-mparam.h: Update.
* mpn/x86/p6/mmx/gmp-mparam.h: Update.
* mpn/x86/pentium4/sse2/gmp-mparam.h: Update.
* mpn/x86/k7/gmp-mparam.h: Update.
* tests/mpz/t-gcd.c (main): Honor command line reps argument.
* tune/speed.h (SPEED_ROUTINE_MPN_GCD_CALL): Simplify and correct code
for generating test operands.
2005-03-17 Niels Möller <>
* mpn/generic/hgcd.c (qstack_adjust): New argument d, saying how much
to adjust the top quotient.
(hgcd_adjust): The quotient can be off by either 1 or 2.
2005-03-16 Torbjorn Granlund <>
* tests/mpz/t-gcd.c (MAX_SCHOENHAGE_THRESHOLD): Set to largest of
gcd,gcdext thresholds.
2005-03-15 Niels Möller <>
* mpn/generic/gcdext.c (gcdext_schoenhage): When calling gcdext_lehmer,
reuse all temporary limb storage, including the storage used for the
2005-03-09 Torbjorn Granlund <>
* mpn/amd64/logops_n.asm: Add MULFUNC_PROLOGUE.
2005-03-05 Torbjorn Granlund <>
* mpn/amd64/gmp-mparam.h: Extend MUL_FFT_TABLE and SQR_FFT_TABLE.
* mpn/ia64/gmp-mparam.h: Likewise.
2005-02-17 Torbjorn Granlund <>
* mpn/ia64/divrem_1.asm: Add preinv entry point.
2005-01-13 Torbjorn Granlund <>
* gmp-impl.h (MPN_SIZEINBASE): Count bits in type size_t.
(MPN_SIZEINBASE_16): Likewise.
2004-12-17 Torbjorn Granlund <>
* tune/speed.c (run_gnuplot): Use lines, not linespoints.
Output a reset gnuplot command initially.
2004-12-04 Torbjorn Granlund <>
* mpn/generic/random2.c (gmp_rrandomb): Rework again.
* mpz/rrandomb.c (gmp_rrandomb): Likewise.
* mpn/amd64/redc_1.asm: Call via PLT when PIC.
2004-11-29 Torbjorn Granlund <>
* mpn/amd64/divrem_1.asm: Add preinv entry point.
* mpn/amd64/gmp-mparam.h: Set USE_PREINV_DIVREM_1 to 1.
2004-11-24 Torbjorn Granlund <>
* mpn/alpha/diveby3.asm: Use correct prefetch instruction.
2004-11-19 Torbjorn Granlund <>
* mpn/alpha/diveby3.asm: Add ",gp" glue in PROLOGUE.
Add r31 dummy operand to `br' instruction.
2004-11-17 Torbjorn Granlund <>
* mpn/powerpc64/mode64/addmul_1.asm: Rewrite.
* mpn/powerpc64/mode64/mul_1.asm: Rewrite.
2004-11-16 Torbjorn Granlund <>
* mpn/alpha/diveby3.asm: New file.
2004-11-13 Torbjorn Granlund <>
* mpn/amd64/popham.asm: New file.
2004-11-12 Torbjorn Granlund <>
* mpn/amd64/add_n.asm: Correct cycle count.
* mpn/amd64/sub_n.asm: Likewise.
* mpn/amd64/dive_1.asm: Speed divisors with many factors of 2.
2004-11-11 Torbjorn Granlund <>
* mpn/amd64/dive_1.asm: New file.
2004-11-10 Torbjorn Granlund <>
* mpn/generic/popham.c: Add comment.
2004-11-09 Torbjorn Granlund <>
* mpn/amd64/com_n.asm: New file.
* mpn/amd64/logops_n.asm: New file.
2004-11-08 Torbjorn Granlund <>
* mpn/powerpc64/com_n.asm: New file.
2004-11-05 Torbjorn Granlund <>
* mpn/amd64/diveby3.asm: New file.
* config.guess: Strip any PPC string in /proc/cpuinfo.
Recognize 970 in that code.
2004-11-01 Torbjorn Granlund <>
* mpn/amd64/mul_basecase.asm: New file.
* mpn/amd64/redc_1.asm: New file.
2004-10-25 Torbjorn Granlund <>
* mpn/powerpc64/mode64/addlsh1_n.asm: Correct cycle counts.
* mpn/powerpc64/README: Update POWER5/PPC970 pipeline information.
* mpn/generic/mul_basecase.c (MAX_LEFT): Add comment.
* doc/gmp.texi: Consistently use "x86" denotation.
(Assembler SIMD Instructions): Mention SSE2 usage.
* demos/pexpr.c (main): Handle "negative" base in mpz_sizeinbase call.
2004-10-18 Torbjorn Granlund <>
* mpn/powerpc64/mode64/submul_1.asm: Shave 2 cycles/limb with new carry
inversion trick.
2004-10-16 Torbjorn Granlund <>
* Support icc under x86.
(ia64-*-linux*): Pass -no-gcc to icc.
2004-10-15 Torbjorn Granlund <>
* longlong.h (ia64 umul_ppmm): Add version for icc.
* Support icc under ia64-*-linux*.
* acinclude.m4: New "compiler works" test for icc 8.1 bug.
(GMP_PROG_CC_IS_GNU): Don't let Intel's icc fool us into thinking it is GCC.
2004-10-14 Torbjorn Granlund <>
* mpn/generic/gcdext.c: Add a few missing TMP_MARK.
2004-10-14 Torbjorn Granlund <>
* acinclude.m4 (GMP_ASM_W32): Try also "data4".
* mpn/ia64/logops_n.asm: Don't use naked "br", rejected by Intel
* mpn/ia64/aors_n.asm: Likewise.
* mpn/ia64/divrem_2.asm: Add ".prologue".
* mpn/ia64/hamdist.asm: Put alloc first in bundle, enforced by the
Intel assembler.
* longlong.h: Exclude masquerading __INTEL_COMPILER from ia64 asm.
* gmp-impl.h: Likewise.
2004-10-12 Torbjorn Granlund <>
* mpn/ia64/mul_2.asm: Rewrite function entry code, write new code for
* mpn/ia64/addmul_2.asm: Likewise.
* tests/devel/try.c: Handle mpn_mul_2 like mpn_addmul_2.
* tune/speed.c (routine): Make R parameter optional for mpn_mul_2.
2004-10-11 Torbjorn Granlund <>
* mpn/sparc64/addmul_1.asm: Update a comment.
* tests/devel/aors_n.c: #include tests.h.
* tests/devel/anymul_1.c: Likewise.
* tests/devel/shift.c: Likewise.
* tests/devel/copy.c: Likewise.
* tests/devel/aors_n.c: Handle also mpn_addlsh1_n, mpn_sublsh1_n,
mpn_rsh1add_n, and mpn_rsh1sub_n.
* mpn/ia64/submul_1.asm: Add TODO item.
* mpn/ia64/aors_n.asm: Rewrite function entry code (again).
* mpn/ia64/aorslsh1_n.asm: Likewise.
* mpn/ia64/logops_n.asm: Likewise.
* mpn/ia64/rsh1aors_n.asm: Tune function entry and feed-in code.
* mpn/ia64/lorrshift.asm: Likewise. Remove several spurious loads.
* tests/devel/ (EXTRA_PROGRAMS): Updates for yesterday's
file removals and additions.
2004-10-10 Torbjorn Granlund <>
* mpn/ia64/copyi.asm: Tune function entry code.
* mpn/ia64/copyd.asm: Likewise.
* mpn/ia64/logops_n.asm: Tune function entry and feed-in code for speed
and size.
* mpn/ia64/aors_n.asm: Likewise.
* mpn/powerpc64/logops_n.asm: Correct cycle counts.
* mpn/powerpc64/mode64/aors_n.asm: Likewise.
* tests/devel/copy.c: Handle both MPN_COPY_INCR and MPN_COPY_DECR.
* tests/devel/logops_n.c: New file, handle all logical operations.
* tests/devel/anymul_1.c: New file, handle mpn_mul_1, mpn_addmul_1, and
* tests/devel/mul_1.c: Remove.
* tests/devel/addmul_1.c: Remove.
* tests/devel/submul_1.c: Remove.
* tests/devel/shift.c: New file, handle mpn_lshift and mpn_rshift.
* tests/devel/lshift.c: Remove.
* tests/devel/rshift.c: Remove.
* tests/devel/aors_n.c: New file, handle mpn_add_n and mpn_sub_n.
* tests/devel/add_n.c: Remove.
* tests/devel/sub_n.c: Remove.
2004-10-09 Torbjorn Granlund <>
* mpn/powerpc64/linux64.m4: Define DEF_OBJECT, END_OBJECT, and TOCREF.
* mpn/powerpc64/aix.m4: Likewise.
* mpn/powerpc64/mode64/invert_limb.asm: Use DEF_OBJECT, END_OBJECT, and
TOCREF for approx_tab.
* mpn/amd64/mul_1.asm: Add mpn_mul_1c entry point.
2004-10-08 Torbjorn Granlund <>
* mpn/powerpc64/copyi.asm: New file.
* mpn/powerpc64/copyd.asm: New file.
* Remove PPC MPN_COPY variants.
* gmp-impl.h: Likewise.
* mpn/powerpc64/logops_n.asm: New file.
* mpn/powerpc64/mode64/invert_limb.asm: New file.
2004-10-07 Torbjorn Granlund <>
* mpn/powerpc64/mode64/aors_n.asm: New file, optimized for POWER4 and
its derivatives.
* mpn/powerpc64/mode64/add_n.asm: Delete.
* mpn/powerpc64/mode64/sub_n.asm: Delete.
* configfsf.guess: Patch HP-UX code to accommodate HP compiler's new
inability to read from stdin.
* mpn/powerpc64/mode64/addsub_n.asm: Remove accidentally added file.
2004-10-02 Torbjorn Granlund <>
* mpn/amd64/README: Update for new developments, fix typos.
* mpn/amd64/mul_1.asm: Tweak addressing (3.25 => 3.0 cycles/limb).
* mpn/amd64/addmul_1.asm: Remove unreachable code block.
2004-09-30 Torbjorn Granlund <>
* mpn/amd64/addmul_1.asm: Rewrite, now 3.25 cycles/limb.
* mpn/ia64/addmul_1.asm: Slightly enhance cross-jumping for code
* mpn/ia64/mul_1.asm: Analogous changes.
2004-09-29 Torbjorn Granlund <>
* gmp-impl.h (x86 ULONG_PARITY): Work around GCC change of "q" register
2004-09-28 Torbjorn Granlund <>
* mpn/ia64/divrem_1.asm: Add cycle counts to loop.
* mpn/ia64/divrem_2.asm: New file.
2004-09-28 Paul Zimmermann <>
* mpn/generic/mul_fft.c (mpn_mul_fft): Fix a bug in the choice of the
recursive fft parameters.
2004-09-20 Torbjorn Granlund <>
* tests/misc.c (tests_rand_start): Default to strtoul for re-seeding.
* tests/mpz/t-mul.c (ref_mpn_mul): Fudge tmp allocation for toom3.
2004-09-19 Torbjorn Granlund <>
* tests/misc.c (tests_rand_start): Shift tv_usec for better seeding.
2004-09-18 Torbjorn Granlund <>
* tests/misc.c (tests_rand_start): Invoke fflush after printing seed.
* tests/mpz/t-mul.c (main): Check environment for GMP_CHECK_FFT, run
extra FFT tests if set.
(ref_mpn_mul): Use library code for kara and toom, but skewed so that
we never use the same algorithm that we're testing.
(mul_kara): Delete.
(debug_mp): Print just one line of large numbers.
(ref_mpn_mul): Rework usage of tp temporary space.
2004-09-15 Torbjorn Granlund <>
* mpn/ia64/mul_2.asm: For HAVE_ABI_32, convert vp.
* mpn/ia64/addmul_2.asm: Likewise.
2004-09-13 Torbjorn Granlund <>
* mpn/ia64/invert_limb.asm: Rewrite.
* mpn/ia64/logops_n.asm: Insert some more stops.
2004-09-12 Torbjorn Granlund <>
* mpn/ia64/gmp-mparam.h: Update.
* mpn/amd64/gmp-mparam.h: Update.
* mpn/ia64/sqr_diagonal.asm: Shave off a few cycles.
2004-09-11 Torbjorn Granlund <>
* mpn/ia64/mul_2.asm: New file.
* mpn/ia64/addmul_2.asm: New file.
* mpn/ia64/addmul_1.asm: Tune a cycle from prologue.
* mpn/ia64/lorrshift.asm: Insert stops after several branches.
* mpn/ia64/aorslsh1_n.asm: Likewise.
* mpn/ia64/rsh1aors_n.asm: Likewise.
* mpn/generic/sqr_basecase.c: In variant for HAVE_NATIVE_mpn_addmul_2,
accumulate carry also for when HAVE_NATIVE_mpn_addlsh1_n.
2004-09-07 Torbjorn Granlund <>
* mpn/ia64/submul_1.asm: Rewrite.
* mpn/ia64/addmul_1.asm: Format to placate HP-UX assembler.
* mpn/ia64/mul_1.asm: Likewise.
2004-09-02 Torbjorn Granlund <>
* mpn/ia64/mul_1.asm: Optimize feed-in code.
* mpn/ia64/addmul_1.asm: Rewrite feed-in code.
2004-08-29 Torbjorn Granlund <>
* tests/mpz/t-sizeinbase.c: Disable mpz_fake_bits and check_sample.
2004-07-16 Torbjorn Granlund <>
* mpn/ia64/addmul_1.asm: Format to placate HP-UX assembler.
2004-06-17 Kevin Ryde <>
* doc/gmp.texi: Use @. when sentence ends with a capital, for good
spacing in tex.
(Language Bindings): Add gmp-d, reported by Ben Hinkle. Update SWI
Prolog URL, reported by Jan Wielemaker.
2004-06-09 Torbjorn Granlund <>
* Handle --enable-fat. Use that to enable x86 fat
builds, remove magic meaning of i386-*-*.
2004-06-03 Kevin Ryde <>
* gmp-impl.h (memset): Use a local char* pointer, in case parameter is
something else (eg. tune/common.c). Reported by Emmanuel Thomé.
2004-06-01 Kevin Ryde <>
* config.guess (i?86-*-*): Avoid "Illegal instruction" message which
goes to stdout on 80386 freebsd4.9.
2004-05-23 Niels Möller <>
* mpn/generic/gcdext.c (gcdext_1_u): New function.
(mpn_gcdext): Use it.
2004-05-23 Torbjorn Granlund <>
* mpn/generic/gcdext.c (gcdext_1_odd): Use masking to avoid jumps.
2004-05-22 Torbjorn Granlund <>
* mpn/x86/pentium4/sse2/addmul_1.asm: Add Prescott cycle numbers.
* mpn/amd64/divrem_1.asm: Shave a cycle from fraction development code.
* mpn/powerpc32/lshift.asm: Add more cycle numbers.
* mpn/powerpc32/rshift.asm: Likewise.
* mpn/ia64/addmul_1.asm: Reformat.
2004-05-21 Torbjorn Granlund <>
* gmp-impl.h (mpn_mullow_n, mpn_mullow_basecase): Declare.
* tune/ Compile gcdext.c.
* gmp-impl.h (GET_STR_THRESHOLD_LIMIT): Lower outrageous value to 150.
(GCDEXT_SCHOENHAGE_THRESHOLD): Set reasonable default. Override when
* tune/tuneup.c (gcdext_schoenhage_threshold): New variable.
(gcdext_threshold): Remove variable.
(tune_gcd_schoenhage): Lower step_factor to 0.1.
(tune_gcdext_schoenhage): New function, based on tune_gcd_schoenhage.
(tune_gcdext): Remove function.
(all): Corresponding changes.
2004-05-21 Niels Möller <>
* mpn/generic/gcdext.c: Complete rewrite. Uses fast Lehmer code for
small operands, and Schoenhage code for large operands.
* tune/speed.h (SPEED_ROUTINE_MPN_GCD_CALL): Ensure first operand is
not smaller than 2nd operand.
2004-05-17 Kevin Ryde <>
* (mpz_get_ui): Use #if instead of plain if, and for nails
use ?: same as normal case, to avoid warnings from Borland C++ 6.0.
Reported by delta trinity.
2004-05-15 Kevin Ryde <>
* tune/time.c (getrusage_backwards_p): New function
(speed_time_init): Use it to exclude broken netbsd1.4.1 getrusage.
* (m68*-*-netbsd1.4*): Remove code pretending getrusage
doesn't exist.
* tune/README (NetBSD 1.4.1 m68k): Update notes.
* (mips*-*-* ABI=n32): Remove gcc_n32_ldflags and
cc_n32_ldflags, libtool knows to put the linker in n32 mode.
2004-05-15 Torbjorn Granlund <>
* config.guess (powerpc*-*-*): Add more processor types to mfpvr code.
* Generalize powerpc subtype matching code.
* mpz/fac_ui.c: Misc cleanups, spelling corrections.
2004-05-14 Kevin Ryde <>
* mpf/sub.c: When one operand cancels high limbs of the other, strip
high zeros on the balance before truncating to destination precision.
Truncating first loses accuracy and can lead to a result of 0 despite the
operands not being equal. Reported by John Abbott.
Also, ensure exponent is zero when result is zero, for instance if
operands are exactly equal.
* tests/mpf/t-sub.c (check_data): New function, exercising these.
2004-05-12 Kevin Ryde <>
* (AC_PROG_RANLIB): New macro, supposedly required by
automake, though it doesn't complain.
* demos/expr/ (ARFLAGS): Add a default setting, to
work around an automake bug.
2004-05-10 Kevin Ryde <>
* */, install-sh, aclocal.m4: Update to automake 1.8.4.
* doc/gmp.texi (Demonstration Programs): Add a remark about expression
evaluation in the main gmp library.
* demos/expr/exprfa.c (mpf_expr_a): Correction to mpX_init, use
mpf_init2 to follow requested precision.
* demos/expr/exprza.c, demos/expr/exprqa.c: Use wrappers for mpX_init,
to make parameters match.
* demos/expr/run-expr.c: Don't use getopt, to avoid needing configury
for optarg declaration. Remove TRY macro, rename foo and bar to var_a
and var_b, for clarity.
* demos/expr/expr-impl.h: Don't use expr-config.h.
* (demos/expr/expr-config.h): Remove.
* demos/expr/ Remove file.
2004-05-08 Kevin Ryde <>
* doc/configuration (Configure): Update for current automake not
copying acinclude.m4 into aclocal.m4.
*,, doc/gmp.texi, doc/configuration,
tests/cxx/, demos/expr/, demos/expr/README,
demos/expr/expr.c, demos/expr/expr.h, demos/expr/,
demos/expr/expr-impl.h, demos/expr/run-expr.c, demos/expr/t-expr.c:
MPFR now published separately, remove various bits.
* mpfr/*, tests/cxx/, demos/expr/exprfr.c,
demos/expr/exprfra.c: Remove.
2004-05-07 Kevin Ryde <>
* tests/cxx/ (TESTS_ENVIRONMENT): Amend c++ shared library
path hack, on k62-unknown-dragonfly1.0 /usr/bin/make runs its commands
"set -e", so we need an "|| true" in case there's nothing to copy (for
instance in a static build).
2004-05-06 Kevin Ryde <>
* mpn/alpha/mode1o.c: Remove, in favour of ...
* mpn/alpha/mode1o.asm: New file.
* mpn/alpha/alpha-defs.m4 (bwx_available_p): New macro.
* tune/amd64.asm: Save rbx in r10 rather than on the stack.
* (x86_64-*-*): Try also "-march=k8 -mno-sse2", in case
we're in ABI=32 on an old OS not supporting xmm regs.
(GMP_GCC_PENTIUM4_SSE2, GMP_OS_X86_XMM): Run these tests under
-march=k8 too, and not under ABI=64.
* doc/gmp.texi (Converting Integers): For mpz_get_d, note truncation
and overflows. For mpz_get_d_2exp note truncation, note result if
OP==0, and cross reference libc frexp.
(Rational Conversions): For mpq_get_d, note truncation and overflows.
(Converting Floats): For mpf_get_d, note truncation and overflows.
For mpf_get_d_2exp, note truncation, note result if OP==0.
(Assembler Code Organisation): Note nails subdirectories.
Clarification of get_d_2exp OP==0 reported by Sylvain Pion.
2004-05-05 Torbjorn Granlund <>
* mpn/generic/mullow_n.c, mpn/generic/mullow_basecase.c: New files
(mainly by Niels Möller).
*, mpn/ Add them.
* tune/ Compile mullow_n.c.
* tune/common.c (speed_mpn_mullow_n, speed_mpn_mullow_basecase):
New functions.
* tune/speed.c (routine): Add entries for mpn_mullow_n and
* tune/tuneup.c (tune_mullow): New function.
* gmp-impl.h (invert_limb): Compute branch-freely.
2004-05-02 Kevin Ryde <>
* mpn/amd64/mode1o.asm: Use movabsq to support large model non-PIC.
Use 32-bit insns to save code bytes, and to save a couple of cycles on
the initial setup multiplies.
2004-05-01 Kevin Ryde <>
* doc/gmp.texi (References): Update gcc online docs url to
* (mips*-*-irix[6789]*): Correction to m4 quoting of this
pattern. (The mips64*-*-* part that is also used is believed to pick up
all current irix6 tuples anyway.) Reported by Rainer Orth.
2004-04-30 Kevin Ryde <>
* acinclude.m4 (GMP_PROG_CC_X86_GOT_EAX_EMITTED,
GMP_ASM_X86_GOT_EAX_OK): New macros.
(GMP_PROG_CC_WORKS): Use them to detect an old gas bug tickled by
recent gcc. Reported by David Newman.
* doc/gmp.texi (Reentrancy): Note also gmp_randinit_default as an
alternative to gmp_randinit.
2004-04-29 Torbjorn Granlund <>
* configfsf.guess: Update to 2004-03-12.
* configfsf.sub: Likewise.
2004-04-27 Torbjorn Granlund <>
* mpz/rrandomb.c (gmp_rrandomb): Rework to avoid extra limb allocation
and to generate even numbers.
* mpn/generic/random2.c (gmp_rrandomb): Likewise.
2004-04-25 Kevin Ryde <>
* gmp-impl.h (FORCE_DOUBLE): Don't use an asm with a match constraint
on a memory output, apparently not supported and provokes a warning
from gcc 3.4.
2004-04-24 Kevin Ryde <>
* longlong.h (count_leading_zeros_gcc_clz,
count_trailing_zeros_gcc_ctz): New macros.
(count_leading_zeros, count_trailing_zeros) [x86]: Use them on gcc
* (x86-*-* gcc_cflags_cpu): Give a -mtune at the start of
each option list, for use by gcc 3.4 to avoid deprecation warnings
about -mcpu.
* mpz/aorsmul.c, mpz/aorsmul_i.c, mpz/cfdiv_q_2exp.c,
mpz/cfdiv_r_2exp.c, mpq/aors.c, mpf/ceilfloor.c: Give REGPARM_ATTR()
on function definition too, as demanded by gcc 3.4.
2004-04-22 Kevin Ryde <>
* tests/rand/t-lc2exp.c (check_bigc1): New test.
* doc/fdl.texi: Tweak @appendixsubsec -> @appendixsec to match our
preference for this in an @appendix, and because texi2pdf doesn't
support @appendixsubsec directly within an @appendix.
2004-04-20 Kevin Ryde <>
* doc/texinfo.tex: Update to 2004-04-07.08 from texinfo 4.7.
* doc/gmp.texi, mpfr/mpfr.texi (@copying): Don't put a line break in
@ref within @copying, recent texinfo.tex doesn't like that.
* demos/perl/GMP.xs (static_functable): Treat cygwin the same as mingw
* */, install-sh: Update to automake 1.8.3.
*, aclocal.m4, configure: Update to libtool 1.5.6.
* gmp-impl.h (LIMB_HIGHBIT_TO_MASK): Use a compile-time constant
expression, rather than a configure test.
* acinclude.m4, (GMP_C_RIGHT_SHIFT): Remove, no longer
* tests/t-hightomask.c: New file.
* tests/ (check_PROGRAMS): Add it.
* macos/configure (parse_top_configure): Look for PACKAGE_NAME and
PACKAGE_VERSION now used by autoconf.
(what_objects): Only demand 9 object files, as for instance occurs in
the scanf directory.
(asm files): Transform labels L(foo) -> Lfoo. Take func name from
PROLOGUE to support empty "EPILOGUE()". Recognise and substitute
register name "define()"s.
* macos/ (CmnObjs): Add tal-notreent.o.
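Illustration for the LIMB_HIGHBIT_TO_MASK item in this entry: a minimal sketch, not the gmp-impl.h definition, of how a constant expression can choose between an arithmetic-shift form and a portable fallback without a configure test. The names limb_t, limb_signed_t, LIMB_BITS and HIGHBIT_TO_MASK are hypothetical stand-ins, and a 64-bit limb is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the library's limb types (assumption: 64 bits). */
typedef uint64_t limb_t;
typedef int64_t  limb_signed_t;
#define LIMB_BITS    64
#define LIMB_HIGHBIT ((limb_t) 1 << (LIMB_BITS - 1))

/* ((limb_signed_t) -1 >> 1) < 0 is a constant expression: it is true when
   the compiler shifts negative values arithmetically.  In that case the
   high bit is smeared across the limb with one shift; otherwise the mask
   is selected explicitly.  The compiler can fold the test away, so no
   configure-time probe is needed.  The cast of a value with the high bit
   set to the signed type is implementation-defined, as in any such trick. */
#define HIGHBIT_TO_MASK(n)                                      \
  (((limb_signed_t) -1 >> 1) < 0                                \
   ? (limb_t) ((limb_signed_t) (n) >> (LIMB_BITS - 1))          \
   : ((n) & LIMB_HIGHBIT) ? ~(limb_t) 0 : (limb_t) 0)

/* Function wrapper so the macro can be exercised from tests. */
static limb_t highbit_to_mask(limb_t n)
{
  return HIGHBIT_TO_MASK(n);
}
```

Either arm yields all ones when the top bit of n is set and zero otherwise, which is the property a test along the lines of tests/t-hightomask.c would presumably check.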
2004-04-19 Torbjorn Granlund <>
* tune/speed.h (SPEED_ROUTINE_MPN_ROOTREM): New #define.
(speed_mpn_rootrem): Declare.
* tune/common.c (speed_mpn_rootrem): New function.
* tune/speed.c (routine): Add entry for mpn_rootrem.
2004-04-16 Kevin Ryde <>
* doc/fdl.texi: Update from FSF, just fixing a couple of typos.
* macos/configure, macos/ Add printf and scanf directories.
* tests/mpz/t-gcd.c (check_data): New function, exercising K6
gcd_finda bug.
2004-04-14 Kevin Ryde <>
* doc/gmp.texi (Reentrancy, Random State Initialization): Note
gmp_randinit use of gmp_errno is not thread safe. Reported by Vincent
* doc/gmp.texi (Random State Initialization): Add index entries for
gmp_errno and constants.
* mpn/m68k/README: Update _SHORT_LIMB -> __GMP_SHORT_LIMB.
* (--enable-mpbsd): Typo Berkley -> Berkeley in help msg.
2004-04-12 Kevin Ryde <>
* demos/perl/GMP.xs (static_functable): New macro, use it for all
function tables, to support mingw DLL builds.
* demos/perl/INSTALL (NOTES FOR PARTICULAR SYSTEMS): Remove note on
DLLs, should be ok now.
* demos/perl/ Print the module and library versions in use.
* demos/perl/, Makefile.PL (VERSION): Set to '2.00'.
* demos/perl/ (COPYRIGHT): New in the doc section.
* Note 4.1.3 libtool versioning info, and REVISION policy.
* tal-debug.c: Add <stdlib.h> for abort.
2004-04-07 Torbjorn Granlund <>
* tests/refmpf.c (refmpf_add_ulp): Adjust exponent when needed.
* mpn/generic/random2.c: Rewrite (clone mpz/rrandomb.c).
2004-04-07 Kevin Ryde <>
* mpn/x86/k6/gcd_finda.asm: Correction jbe -> jb in initial setups.
Zero flag is wrong here, it reflects only the high limb of the compare,
leading to n1>=n2 not satisfied and wrong results. cp[1]==0x7FFFFFFF
with cp[0]>=0x80000001 provokes this.
* doc/gmp.texi (BSD Compatible Functions): Note "pow" name clash under
the pow function description too.
(Language Bindings): Add XEmacs (betas at this stage). Reported by
Jerry James.
* tests/refmpn.c (refmpn_mod2): Correction to ASSERTs, r==a is allowed.
* gen-psqr.c (generate_mod): Cast mpz_invert_ui_2exp args, for K&R.
* gen-bases.c, gen-fib.c, gen-psqr.c: For mpz_out_str, use stdout
instead of 0, in case a K&R compiler treats int and FILE* params differently.
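Illustration for the gcd_finda.asm item in this entry: a sketch in C, not the K6 assembly, of why a branch that also looks at the zero flag (jbe) can go wrong after a two-limb compare done as a subtract followed by a subtract-with-borrow. The flag model and all names here are hypothetical; 32-bit limbs are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two x86 flags of interest after comparing the
   two-limb values (a_hi:a_lo) and (b_hi:b_lo) with a low-limb subtract
   followed by a high-limb subtract-with-borrow. */
typedef struct { int carry; int zero; } flags_t;

static flags_t cmp2_flags(uint32_t a_hi, uint32_t a_lo,
                          uint32_t b_hi, uint32_t b_lo)
{
  flags_t f;
  int borrow_lo = a_lo < b_lo;                 /* borrow out of the low limb */
  uint64_t hi = (uint64_t) a_hi - b_hi - (uint64_t) borrow_lo;
  f.carry = (hi >> 63) != 0;                   /* final borrow: a < b */
  f.zero  = (uint32_t) hi == 0;                /* reflects the high limb only */
  return f;
}

/* jb branches on carry alone; jbe branches on carry or zero. */
static int jb_taken(flags_t f)  { return f.carry; }
static int jbe_taken(flags_t f) { return f.carry || f.zero; }
```

With a = (1:0) and b = (0:1) the low subtract borrows, the high result is 0, and the zero flag is set although a > b. In this model jbe_taken reports "a <= b" while jb_taken correctly reports "a >= b".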
2004-04-04 Kevin Ryde <>
* gmp-impl.h (BSWAP_LIMB) [amd64]: New macro.
(FORCE_DOUBLE): Use this for amd64 too.
* tests/amd64check.c, tests/amd64call.asm: New files, derived in part
from x86check.c and x86call.asm.
* tests/ (EXTRA_libtests_la_SOURCES): Add them.
* (x86_64-*-* ABI=64): Use them.
2004-04-03 Kevin Ryde <>
* mpn/amd64/mode1o.asm: New file.
* mpn/amd64/amd64-defs.m4 (ASSERT): New macro.
* mpn/x86/k7/mmx/divrem_1.asm, mpn/x86/pentium4/sse2/divrem_1.asm: Add
note on how "dr" part of algorithm is handled.
* mpn/x86/k7/dive_1.asm, mpn/x86/k7/mod_34lsub1.asm,
mpn/x86/k7/mode1o.asm: Note Hammer (32-bit mode) speeds.
2004-03-31 Kevin Ryde <>
* doc/gmp.texi (Language Bindings): Add GOO, MLGMP and Numerix.
* mpf/mul_2exp.c, mpf/div_2exp.c: Rate u==0 as UNLIKELY.
2004-03-28 Torbjorn Granlund <>
* mpn/amd64/divrem_1.asm: Trim a few cycles.
2004-03-27 Torbjorn Granlund <>
* mpn/amd64/sublsh1_n.asm: Fix typo.
* mpn/generic/divrem_1.c: Fix typo.
* mpn/generic/sqr_basecase.c: Fix typo.
* mpn/amd64/divrem_1.asm: New file.
2004-03-20 Kevin Ryde <>
* longlong.h (power, powerpc): Add comments on how we select this code.
* (mpz_get_ui): Use ?: instead of mask style, gcc treats the
two identically but ?: is a bit clearer.
* insert-dbl.c: Remove file, no longer used, scaling is now integrated
in mpn_get_d.
* (libgmp_la_SOURCES): Remove insert-dbl.c.
* gmp-impl.h (__gmp_scale2): Remove prototype.
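Illustration for the mpz_get_ui item in this entry: a sketch of the two styles mentioned, "mask" versus "?:", for returning the low limb of a number or 0 when the number has no limbs in use. This is not the header's actual code; limb_t and the function names are hypothetical, and it is assumed that at least one limb is always allocated so that reading limbs[0] is safe.

```c
#include <assert.h>

typedef unsigned long limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Mask style: -(limb_t) (size != 0) is all ones when size is nonzero and
   all zeros otherwise, so the AND keeps or clears the low limb. */
static limb_t low_limb_mask(const limb_t *limbs, int size)
{
  limb_t l = limbs[0];          /* assumes one limb is always allocated */
  return l & -(limb_t) (size != 0);
}

/* ?: style: the same result, stated directly. */
static limb_t low_limb_cond(const limb_t *limbs, int size)
{
  limb_t l = limbs[0];
  return size != 0 ? l : 0;
}

/* Helper for checking that the two styles agree on a one-limb value. */
static int styles_agree(limb_t value, int size)
{
  limb_t limbs[1];
  limbs[0] = value;
  return low_limb_mask(limbs, size) == low_limb_cond(limbs, size);
}
```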
2004-03-17 Kevin Ryde <>
* mpn/x86/fat/fat.c (__gmpn_cpuvec_init, fake_cpuid_table): Add x86_64.
* mpq/get_d.c: Use mpn_tdiv_qr, demand den>0 per canonical form.
2004-03-16 Torbjorn Granlund <>
* mpn/generic/sqr_basecase.c: Add versions using mpn_addmul_2 and
2004-03-14 Kevin Ryde <>
* mpf/mul_ui.c: Incorporate carry from low limbs, for exactness.
* tests/mpf/t-mul_ui.c: New file.
* tests/mpf/ (check_PROGRAMS): Add it.
* mpf/div.c: Use mpn_tdiv_qr. Use just one TMP_ALLOC. Use full
divisor, since truncating can lose accuracy.
* tests/mpf/t-div.c: New file.
* tests/mpf/ (check_PROGRAMS): Add it.
* tests/mpf/t-set_q.c, tests/mpf/t-ui_div.c (check_various): Amend
bogus 99/4 test.
* tests/mpf/t-ui_div.c (check_rand): Exercise r==v overlap.
* tests/refmpf.c, tests/tests.h (refmpf_set_overlap): New function.
* mpf/cmp_si.c [nails]: Correction, cast vval in exp comparisons, for
when vval=-0x800..00 and limb==longlong.
* mpf/cmp_si.c [nails]: Correction, return usign instead of 1 when
uexp==2 but value bigger than an mp_limb_t.
* tests/mpf/t-cmp_si.c (check_data): Add test cases.
* tests/trace.c (mpf_trace): Use ABS(mp_trace_base) to allow for
negative bases used for upper case hex in integer traces.
2004-03-12 Torbjorn Granlund <>
* mpn/generic/sb_divrem_mn.c: Correct header comment.
2004-03-11 Kevin Ryde <>
* aclocal.m4, configure, Downgrade to libtool 1.5, version
1.5.2 doesn't remove .libs/*.a files when rebuilding, which is bad for
development when changing contents or with duplicate named files like
we have.
Revert this, ie restore AR_FLAGS=cq:
* acinclude.m4 (GMP_PROG_AR): Remove AR_FLAGS=cq, libtool 1.5.2 now
does this itself on detecting duplicate object filenames in piecewise
linking mode.
* randbui.c, randmui.c [longlong+nails]: Correction to conditionals
for second limb.
* mpz/aors_ui.h, mpz/cdiv_q_ui.c, mpz/cdiv_qr_ui.c, mpz/cdiv_r_ui.c,
mpz/cdiv_ui.c, mpz/fdiv_q_ui.c, mpz/fdiv_qr_ui.c, mpz/fdiv_r_ui.c,
mpz/fdiv_ui.c, mpz/gcd_ui.c, mpz/iset_ui.c, mpz/lcm_ui.c,
mpz/set_ui.c, mpz/tdiv_q_ui.c, mpz/tdiv_qr_ui.c, mpz/tdiv_r_ui.c,
mpz/tdiv_ui.c, mpz/ui_sub.c, mpf/div_ui.c, mpf/mul_ui.c
[longlong+nails]: Amend #if to avoid warnings about shift amount.
2004-03-07 Kevin Ryde <>
* mpf/reldiff.c: Use rprec+ysize limbs for d, to ensure accurate
result. Inline mpf_abs(d,d) and mpf_cmp_ui(x,0), and rate the latter
* mpf/ui_div.c: Use mpn_tdiv_qr. Use just one TMP_ALLOC. Use full
divisor, since truncating can lose accuracy.
* tests/mpf/t-ui_div.c: New file.
* tests/mpf/ (check_PROGRAMS): Add it.
* mpf/set_q.c: Expand TMP_ALLOC_LIMBS_2, to make conditional clearer
and avoid 1 limb alloc when not wanted.
* gmp-impl.h (WANT_TMP_DEBUG): Define to 0 if not defined.
(TMP_ALLOC_LIMBS_2): Use "if" within macro rather than "#if", for less
preprocessor conditionals.
* mpf/mul_2exp.c, mpf/div_2exp.c: Add some comments.
* tests/refmpn.c (refmpn_sb_divrem_mn, refmpn_tdiv_qr): Nailify.
2004-03-04 Kevin Ryde <>
* gen-psqr.c (print): Add CNST_LIMB in PERFSQR_MOD_TEST, for benefit
of K&R.
* tests/mpn/t-perfsqr.c (PERFSQR_MOD_1): Use CNST_LIMB for K&R.
* doc/configuration (Configure): Remove mkinstalldirs, no longer used.
* acinclude.m4 (GMP_PROG_AR): Remove AR_FLAGS=cq, libtool 1.5.2 now
does this itself on detecting duplicate object filenames in piecewise
linking mode.
* (hppa2.0*-*-*): Test sizeof(long) == 4 or 8 to verify
ABI=2.0n versus ABI=2.0w. In particular this lets CC=cc_bundled
correctly fall back to ABI=2.0n (we don't automatically add CC=+DD64
to that compiler, currently).
* doc/gmp.texi (Reentrancy): Note C++ mpf_class constructors using
global default precision.
(Random State Miscellaneous): Describe gmp_urandomb_ui as giving N
(C++ Interface Floats): Describe operator= copying the value, not the
precision, and what this can mean about copy constructor versus
default constructor plus assignment.
* mpf/set_q.c: Use mpn_tdiv_qr rather than mpn_divrem, so no shifting.
Don't truncate the divisor, it can make the result inaccurate.
* tests/mpf/t-set_q.c: New file.
* tests/mpf/ (check_PROGRAMS): Add it.
* mpf/set.c: Use MPN_COPY_INCR, in case r==u and ABSIZ(u) > PREC(r)+1.
No actual bug here, because MPN_COPY has thus far been an alias for
MPN_COPY_INCR, only an ASSERT failure.
* tests/mpf/t-set.c: New file.
* tests/mpf/ (check_PROGRAMS): Add it.
* mpf/set.c, mpf/iset.c: Do MPN_COPY last, for possible tail call.
* mpf/set_d.c: Rate d==0 as UNLIKELY. Store size before extract call,
to shorten lifespan of "negative".
* mpf/init.c, mpf/init2.c, mpf/iset_d.c, mpf/iset_si.c,
mpf/iset_str.c, mpf/iset_ui.c: Store prec before alloc call, for one
less live quantity across that call.
* mpf/init.c, mpf/init2.c, mpf/iset_str.c: Store size and exp before
alloc call, to overlap with other operations.
* tests/refmpf.c, tests/tests.h (refmpf_fill, refmpf_normalize,
refmpf_validate, refmpf_validate_division): New functions.
* tests/refmpn.c, tests/tests.h (refmpn_copy_extend,
refmpn_lshift_or_copy_any, refmpn_rshift_or_copy_any): New functions.
* tal-debug.c: Add <string.h> for strcmp.
* tests/cxx/ (check_mpz, check_mpq, check_mpf): Use size_t
for loop index, to quieten g++ warning.
2004-03-02 Kevin Ryde <>
* tests/mpn/t-hgcd.c: Use __GMP_PROTO on prototypes.
2004-03-01 Torbjorn Granlund <>
With Karl Hasselström:
* mpn/generic/dc_divrem_n.c (mpn_dc_div_2_by_1): New function, with
meat from old mpn_dc_divrem_n. Accept scratch parameter. Rewrite to
avoid a recursive call.
(mpn_dc_div_3_by_2): New function, with meat from old
mpn_dc_div_3_halves_by_2. Accept scratch parameter.
(mpn_dc_divrem_n): Now just allocate scratch space and call new
2004-02-29 Kevin Ryde <>
* longlong.h (count_leading_zeros) [alpha gcc]: New version, inlining
mpn/alpha/cntlz.asm cmpbge technique.
* aclocal.m4, configure, install-sh, missing,,
*/ Update to automake 1.8.2 and libtool 1.5.2.
* doc/gmp.texi (C++ Interface Integers): Note / and % rounding follows
C99 / and %.
(Exact Remainder): Index entries for divisibility testing algorithm.
* tune/time.c (speed_endtime): Return 0.0 for negative time measured.
Revise usage comments for clarity.
* tune/common.c (speed_measure): Recognise speed_endtime 0.0 for
failed measurement.
* tests/mpn/t-get_d.c (check_rand): Correction to nhigh_mask setup.
2004-02-27 Torbjorn Granlund <>
* tune/tuneup.c (tune_dc, tune_set_str): Up param.step_factor.
* tests/mpz/t-gcd.c: Decrease # of tests to 50.
2004-02-27 Kevin Ryde <>
* tests/devel/try.c: Add a comment that this is not for Cray systems.
* mpf/set_q.c: Don't support den(q)<0, demand canonical form in the
usual way.
2004-02-24 Torbjorn Granlund <>
From Kevin:
* mpn/generic/mul_fft.c (mpn_fft_add_modF): Loop until normalization
criterion met.
2004-02-22 Kevin Ryde <>
Remove files that might look like compiler output, so our "||"
alternatives are not fooled.
* acinclude.m4 (GMP_PROG_CC_WORKS): Add test for lshift_com code
mis-compiled by certain IA-64 HP cc at +O3.
* gmp-impl.h (USE_LEADING_REGPARM): Disable under prof or gprof, for
the benefit of freebsd where .mcount clobbers registers. Spotted by
2004-02-21 Kevin Ryde <>
* (sparc64-*-*bsd*): Amend -m32 setup for ABI=32, so it's
not used in ABI=64 on the BSD systems.
2004-02-18 Niels Möller <>
* tests/mpz/t-gcd.c (gcdext_valid_p): New function.
(ref_mpz_gcd): Deleted function.
(one_test): Rearranged to call mpz_gcdext first, so that the
returned value can be validated.
(main): Don't use ref_mpz_gcd.
2004-02-18 Torbjorn Granlund <>
* gmp-impl.h (MPN_TOOM3_MAX_N): Move to !WANT_FFT section.
* tests/mpz/t-mul.c: Exclude special huge operands unless WANT_FFT.
* mpz/rrandomb.c (gmp_rrandomb): Rewrite.
* mpn/generic/mul_n.c (mpn_toom3_sqr_n): Remove write-only variable c5.
2004-02-18 Kevin Ryde <>
* mpf/iset_si.c, mpf/iset_ui.c, mpf/set_si.c, mpf/set_ui.c [nails]:
Always store second limb, to avoid a conditional.
* tests/mpf/t-get_ui.c: New file.
* tests/mpf/ (check_PROGRAMS): Add it.
* tests/mpf/t-get_si.c (check_limbdata): Further tests.
* gmp-impl.h (MP_EXP_T_MAX, MP_EXP_T_MIN): New defines.
* mpf/get_ui.c, mpf/get_si.c: Remove size==0 test, it's covered by
other conditions. Attempt greater clarity by expressing conditions as
based on available data range.
* mpf/get_si.c [nails]: Correction, don't bail on exp > abs_size,
since may still have second limb above radix point available.
* mpf/get_ui.c: Nailify.
2004-02-16 Kevin Ryde <>
* mpz/scan0.c, mpz/scan1.c: Use count_trailing_zeros, instead of
count_leading_zeros on limb&-limb.
* mpf/sqrt.c: Use "/ 2" for exp, avoiding C undefined behaviour on
">>" of negatives. Correction to comment, exp is rounded upwards.
SIZ(r) always prec now, no need for tsize expression. Store EXP(r)
and SIZ(r) where calculated to reduce variable lifespans. Make tsize
mp_size_t not mp_exp_t, though of course those are currently the same.
GMP_ERROR_UNUSED_ERROR): Remove, never used or documented, and we
don't want to use globals for communicating error information.
* mpz/gcd_ui.c [nails]: Correction, actually return a value.
* mpn/generic/addmul_1.c, mpn/generic/submul_1.c [nails==1]: Add code.
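Note on the mpz/scan0.c, mpz/scan1.c entry above: finding the lowest set bit by counting trailing zeros gives the same index as isolating that bit with limb&-limb and counting leading zeros. The sketch below checks that equivalence with portable loops. It is an illustration only: the function names are hypothetical, unsigned long stands in for a limb, and GMP's count_trailing_zeros and count_leading_zeros macros are not used here.

```c
#include <assert.h>
#include <limits.h>

/* Index of the lowest set bit, found by counting trailing zeros.
   limb must be nonzero.  */
unsigned
lowest_bit_by_trailing (unsigned long limb)
{
  unsigned n = 0;

  assert (limb != 0);
  while ((limb & 1) == 0)
    {
      limb >>= 1;
      n++;
    }
  return n;
}

/* Same index by the older route: limb & -limb keeps only the lowest set
   bit, and counting the leading zeros of that single bit locates it.
   limb must be nonzero.  */
unsigned
lowest_bit_by_leading (unsigned long limb)
{
  const unsigned bits = sizeof (unsigned long) * CHAR_BIT;
  unsigned long isolated;
  unsigned lz = 0;

  assert (limb != 0);
  isolated = limb & -limb;      /* unsigned negation is well defined */
  while ((isolated >> (bits - 1 - lz)) == 0)
    lz++;
  return bits - 1 - lz;
}
```

The direct form skips the isolation step, which is presumably why it is preferred where a trailing-zero count is available.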
2004-02-15 Kevin Ryde <>
* tests/mpz/t-jac.c (check_data): Remove unnecessary variable
2004-02-14 Torbjorn Granlund <>
* mpn/ia64/aors_n.asm: Break a group with a RAW conflict.
2004-02-14 Kevin Ryde <>
* acinclude.m4 (GMP_C_RIGHT_SHIFT): Note that it's "long"s which we're
concerned about.
* mpn/generic/mul_n.c: Add some remarks about toom3 high zero
* mpn/generic/scan0.c, mpn/generic/scan1.c: Remove design issue
remarks. What to do about going outside `up' space is a problem, but
anything to address it would be an incompatible change.
2004-02-12 Torbjorn Granlund <>
* tests/mpn/t-hgcd.c: Remove unused variables.
* mpn/ia64/hamdist.asm: Remove bundling incompatible with HP-UX
assembler. Misc HP-UX changes.
* mpn/ia64/gcd_1.asm: Add some syntax to placate the HP-UX assembler.
2004-02-11 Kevin Ryde <>
* longlong.h (power, powerpc): Use HAVE_HOST_CPU_FAMILY_power and
HAVE_HOST_CPU_FAMILY_powerpc rather than various cpp defines.
* gmp-impl.h: Add remarks about limits.h and Cray etc.
* mpn/ia64/mul_1.asm: Don't put .pred directives on labelled lines,
hpux 11.23 assembler doesn't like that.
* mpn/ia64/README: Add a note on this.
* dumbmp.c (mpz_mul): Set ALLOC(r) for new data block used. Reported
by Jason Moxham.
* mpn/pa32/README, mpn/pa64/README (REFERENCES): New sections.
2004-02-10 Torbjorn Granlund <>
* tests/mpz/t-gcd.c: Decrease # of tests run.
* mpn/*/gmp-mparam.h: Add HGCD values, update TOOM values.
2004-02-01 Torbjorn Granlund <>
From Kevin:
* config.guess: Recognize AMD's hammer processors, return x86_64.
2004-01-31 Niels Möller <>
* mpn/generic/hgcd.c (mpn_cmp_sum3): Declare static.
2004-01-25 Niels Möller <>
* tests/mpn/ (check_PROGRAMS): Add t-hgcd.
* mpn/generic/hgcd.c (hgcd_jebelean): Simplify, use mpn_cmp_sum3.
(mpn_cmp_sum3): New function.
(mpn_diff_smaller_p): Remove.
(hgcd_final, hgcd_jebelean, hgcd_small_1, hgcd_small_2, euclid_step):
Remove tp,talloc arguments. Callers changed.
2004-01-25 Torbjorn Granlund <>
* tune/tuneup.c (all): Reenable calls of tune_gcd_schoenhage and
* mpn/generic/gcd.c: Reenable Schoenhage code.
With Niels Möller:
* mpn/generic/hgcd.c: Add const and inline to several functions.
(qstack_push_start qstack_push_end qstack_push_quotient): Remove.
(euclid_step): Insert removed functions here.
(hgcd_adjust): Simplify, don't handle d != 1.
(qstack_adjust): Corresponding changes.
(mpn_hgcd2_lehmer_step): Remove redundant tests for bh against zero.
(hgcd_start_row_p): Tweak.
(hgcd_final): Shorten life of ralloc.
2004-01-24 Kevin Ryde <>
* tests/mpf/t-sqrt.c (check_rand1): Further diagnostic printouts.
* mpn/generic/sqrtrem.c (mpn_sqrtrem): Add ASSERT_MPN.
(mpn_dc_sqrtrem): Add casts for K&R.
* mpf/sqrt_ui.c: Nailify.
* mpf/set_z.c: Do MPN_COPY last, for possible tail call.
* doc/gmp.texi (Miscellaneous Float Functions): For mpf_random2, note
exponent is in limbs.
* mpn/ia64/README: Add remark about concentrating on itanium-2.
2004-01-22 Kevin Ryde <>
* mpf/sqrt.c: Change tsize calculation to get prec limbs result
always, previously got prec+1 when exp was odd.
* tests/mpf/t-sqrt.c (check_rand1): New function, code from main.
(check_rand2): New function.
* mpf/sqrt_ui.c: Change rsize calculation to get prec limbs result,
previously got prec+1.
* tests/mpf/t-sqrt_ui.c: New file.
* tests/mpf/ (check_PROGRAMS): Add it.
* tests/refmpf.c, tests/tests.h (refmpf_add_ulp,
refmpf_set_prec_limbs): New functions.
* mpz/get_d_2exp.c, mpf/get_d_2exp.c: Remove x86+m68k force to double,
mpn_get_d now does this. Remove res==1.0 check for round upwards,
mpn_get_d now rounds towards zero. Move exp store to make mpn_get_d a
tail call.
* (x86-*-*): Use ABI=32 rather than ABI=standard.
Use gcc -m32 when available, to force mode on bi-arch amd64 gcc.
*, acinclude.m4 (x86_64-*-*): Merge into plain x86 setups
as ABI=64. Support ABI=32, using athlon code. Use gcc -mcpu=k8,
(amd64-*-*): Remove pattern, config.sub only gives x86_64.
* doc/gmp.texi (ABI and ISA): Add x86_64 dual ABIs.
* mpn/amd64/README: Add reference to ABI spec.
2004-01-17 Niels Möller <>
* mpn/generic/hgcd.c (hgcd_adjust): Backed out mpn_addlsh1_n
change for now.
* mpn/generic/hgcd.c (hgcd_adjust): Fixed calls of mpn_addlsh1_n.
2004-01-17 Kevin Ryde <>
* tune/README: Remove open/mpn versions of toom3, no longer exist.
* tune/powerpc64.asm: Remove unused L(again).
* tune/time.c (mftb): Note single mftb possible for powerpc64.
* mpn/generic/mode1o.c: Use "c<s" to do underflow detection in last
step, for better parallelism.
* mpn/generic/get_d.c: Preserve comments about hppa fcnv,udw,dbl from
previous mpz_get_d code.
* tune/freq.c: Add some comments about systems not covered.
* (_GMP_H_HAVE_FILE): Add _MSL_STDIO_H for Metrowerks.
Reported by Tomas Zahradnicky.
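Note on the mpn/generic/mode1o.c entry above: in unsigned arithmetic, underflow of c-s can be detected either from the result or directly from the inputs as "c<s". The input form does not depend on the subtraction, so the two operations can proceed in parallel. The sketch below shows both forms agreeing. It is an illustration only and not the mode1o.c code: the function names are hypothetical and unsigned long stands in for a limb.

```c
/* Compute c - s with wraparound and report underflow by comparing the
   inputs.  The comparison is independent of the subtraction.  */
unsigned long
sub_borrow_from_inputs (unsigned long c, unsigned long s, int *borrow)
{
  *borrow = (c < s);
  return c - s;
}

/* Same computation, but underflow is derived from the difference: the
   result exceeds c exactly when the subtraction wrapped around.  The
   comparison has to wait for the subtraction to finish.  */
unsigned long
sub_borrow_from_result (unsigned long c, unsigned long s, int *borrow)
{
  unsigned long l = c - s;

  *borrow = (l > c);
  return l;
}
```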
2004-01-16 Niels Möller <>
* mpn/generic/hgcd.c (mpn_diff_smaller_p): Use MPN_DECR_U.
(hgcd_adjust): Use mpn_addlsh1_n when available.
2004-01-16 Kevin Ryde <>
* (powerpc64-*-linux*): Try gcc64. Try -m64 with
"cflags_maybe" to get it used in all probing. Add sizeof-long-8 test
to check the mode is right if -m64 is not applicable.
2004-01-15 Kevin Ryde <>
* (--with-readline=detect): Check for readline/readline.h
and readline/history.h. Report result of detection.
2004-01-14 Niels Möller <>
* tune/speed.c (routine): Disabled speed_mpn_hgcd_lehmer.
* tune/common.c (speed_mpn_hgcd_lehmer): Disabled function.
* mpn/generic/hgcd.c (mpn_hgcd_lehmer_itch, mpn_hgcd_lehmer)
(mpn_hgcd_equal): Deleted functions.
* mpn/generic/gcd.c (hgcd_start_row_p): Deleted function.
(gcd_schoenhage): Deleted assertion code using mpn_hgcd_lehmer.
* mpn/generic/hgcd.c (hgcd_final): Fixed ASSERT typos.
(mpn_hgcd): To use Lehmer's algorithm, call hgcd_final directly,
not mpn_hgcd_lehmer.
* mpn/generic/gcd.c (gcd_schoenhage): Updated for changes to
mpn_hgcd and mpn_hgcd_fix. (Schoenhage code is still disabled).