Search Results - "Gustavson, F.G."
-
1
Series approximation methods for divide and square root in the Power3™ processor
Published in Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) (1999)“…The Power3 processor is a 64-bit implementation of the PowerPC™ architecture and is the successor to the Power2™ processor for workstations and…”
Conference Proceeding -
2
A high-performance SIMD floating point unit for BlueGene/L: architecture, compilation, and algorithm design
Published in Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004 (2004)“…We describe the design, implementation, and evaluation of a dual-issue SIMD-like extension of the PowerPC 440 floating-point unit (FPU) core. This extended FPU…”
Conference Proceeding -
3
Implementing Linear Algebra Algorithms for Dense Matrices on a Vector Pipeline Machine
Published in SIAM review (01-01-1984)“…This paper examines common implementations of linear algebra algorithms, such as matrix-vector multiplication, matrix-matrix multiplication and the solution of…”
Journal Article -
4
A new efficient algorithm for solving differential-algebraic systems using implicit backward differentiation formulas
Published in Proceedings of the IEEE (01-01-1972)“…The backward differentiation formulas (BDF), of order 1 up to 6 are described as they are applied to a system of differential algebraic equations. The BDF…”
Journal Article -
5
A high performance parallel algorithm for 1-D FFT
Published in Proceedings of Supercomputing '94 (1994)“…Proposes a parallel high-performance fast Fourier transform (FFT) algorithm based on a multi-dimensional formulation. We use this to solve a commonly…”
Conference Proceeding -
6
A high performance algorithm using pre-processing for the sparse matrix-vector multiplication
Published in Proceedings Supercomputing '92 (1992)“…The authors propose a feature-extraction-based algorithm (FEBA) for sparse matrix-vector multiplication. The key idea of FEBA is to exploit any regular…”
Conference Proceeding -
7
An efficient parallel algorithm for the 3-D FFT NAS parallel benchmark
Published in Proceedings of IEEE Scalable High Performance Computing Conference (1994)“…We propose an efficient algorithm to implement the 3D NAS FFT benchmark. The proposed algorithm overlaps the communication with the computation. On parallel…”
Conference Proceeding