COMPS260F Mini Project - Group2

Graphics processing unit architecture

Abstract

The Graphics Processing Unit (GPU) is an important processor in a computer. It is a microprocessor and one of the main processing units in the architecture that produces the graphics display. Because the GPU is built around parallelism and vector processing units, it is not limited to graphics: it can also be used very efficiently for research and for algorithms that involve large amounts of mathematical calculation, such as floating-point arithmetic and large-scale matrix or vector operations.

Roles of GPU

The GPU was originally invented to take load off the main system resources (such as the Central Processing Unit (CPU), main memory, and the system bus), so that graphical operations and I/O requests can be processed efficiently. This is done by providing dedicated graphics resources, including a separate graphics processor and graphics memory.

Another goal of the GPU is to enable the representation of a 3D world on the computer. Some GPUs are therefore customized to provide extra computational power for 3D tasks.

Compared to the CPU, the GPU is tailored for highly parallel operation and has many parallel execution units. In addition, GPUs have a more advanced memory interface and much deeper pipelines, which together provide significantly higher data throughput.

The classic Architecture of GPU - Nvidia 6800 Series

The Nvidia GeForce 6800 Series architecture is introduced here to make the basic concepts of GPU architecture easier to understand. The following diagram shows the structure of a 6800 GPU:

GeForce 6 Series GPU Block Diagram

The list below shows the stages of the graphics processing procedure:
1. Vertex Shader

The Vertex Shader is a programmable processing function. It is mainly used in 3D environments to add effects to objects.

Its purpose is to transform the 3D position of each vertex of an object into the 2D coordinate at which it appears on the screen. In addition, a Vertex Shader can manipulate position, texture coordinates and color, but note that it cannot create new vertices.
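The 3D-to-2D step can be illustrated with a minimal Python sketch. It is not how any particular GPU or API implements a vertex shader; the function names, the 90 degree field of view and the 640x480 screen are assumptions chosen for illustration. The camera is assumed to sit at the origin looking down the negative z axis.

```python
import math

def vertex_shader(position, fov_degrees=90.0, aspect=1.0):
    """Project one camera-space vertex (x, y, z) to 2D normalized coordinates.

    Assumes the camera looks down the negative z axis, so visible
    vertices have z < 0. One vertex goes in and one vertex comes out.
    """
    x, y, z = position
    if z >= 0:
        raise ValueError("vertex is behind the camera")
    f = 1.0 / math.tan(math.radians(fov_degrees) / 2.0)  # focal scale
    # Perspective divide: points farther away move toward the centre.
    return (f / aspect * x / -z, f * y / -z)

def to_pixels(ndc, width, height):
    """Map normalized coordinates in [-1, 1] to pixels (y axis pointing down)."""
    nx, ny = ndc
    return ((nx + 1.0) * 0.5 * width, (1.0 - ny) * 0.5 * height)

# Hypothetical triangle in camera space.
triangle = [(-1.0, -1.0, -2.0), (1.0, -1.0, -2.0), (0.0, 1.0, -4.0)]
projected = [vertex_shader(v) for v in triangle]  # same vertex count in and out
screen = [to_pixels(p, 640, 480) for p in projected]
```

The shader is applied to every vertex independently and returns exactly one vertex per input, which matches the restriction that it cannot create new vertices.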

2. Geometry Shader

Geometry Shader is a shader program model which generates new graphics primitives like triangles, lines and points.

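Geometry shader programs are executed after vertex shaders. A geometry shader takes a whole primitive as input and emits zero or more primitives, which are then rasterized; the resulting fragments are passed to the Pixel Shader. Note that this stage is not part of the GeForce 6 Series itself; it arrived with Direct3D 10 class hardware such as the GeForce 8 Series.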
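As a rough illustration of generating new primitives, the Python sketch below takes one triangle and emits four smaller ones by splitting each edge at its midpoint. The subdivision scheme and the function names are assumptions picked for the example, not a description of any specific shader.

```python
def midpoint(a, b):
    """Point halfway between two 2D or 3D points."""
    return tuple((ai + bi) / 2.0 for ai, bi in zip(a, b))

def geometry_shader(triangle):
    """Take one whole triangle and emit four smaller triangles."""
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    # Three corner triangles plus the central one.
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

tri = ((0.0, 0.0), (2.0, 0.0), (0.0, 2.0))
emitted = geometry_shader(tri)  # one primitive in, four primitives out
```

Unlike the vertex shader, the function sees all vertices of the primitive at once and is free to return more (or fewer) primitives than it received.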

3. Pixel Shader

A Pixel Shader is a computation kernel function that computes the color and other attributes of each pixel. Pixel Shaders range from always outputting the same color, to applying a lighting value, to performing bump mapping, shadows, specular highlights, translucency and other effects.
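The two simplest cases mentioned above can be sketched in Python as follows. The fragment layout (a dictionary with "color" and "normal" keys) and the Lambertian diffuse term used as the lighting value are assumptions made for this example.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def constant_shader(fragment):
    """Simplest case: every fragment gets the same color (red here)."""
    return (1.0, 0.0, 0.0)

def diffuse_shader(fragment, light_dir=(0.0, 0.0, 1.0)):
    """Scale the fragment's base color by a Lambertian lighting value."""
    n = normalize(fragment["normal"])
    l = normalize(light_dir)
    # Cosine of the angle between normal and light, clamped at zero.
    intensity = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(c * intensity for c in fragment["color"])

# Hypothetical fragments: one facing the light, one facing away.
facing = diffuse_shader({"color": (0.2, 0.6, 1.0), "normal": (0.0, 0.0, 2.0)})
away = diffuse_shader({"color": (0.2, 0.6, 1.0), "normal": (0.0, 0.0, -1.0)})
```

A surface facing the light keeps its full base color, while one facing away is shaded black; effects such as bump mapping build on the same idea by perturbing the normal before the lighting value is computed.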

Working Cycle - The Pipeline technology in GPU

The GPU receives geometry information (3D data) from the CPU as input and produces a picture (2D data) as output. The process can be divided into three stages: the Application Stage, the Geometry Stage and the Rasterization Stage.

In the first stage, the Application Stage, the host interface is the communication bridge between the CPU and the GPU. It receives commands from the CPU and geometry information from system memory. It then outputs a stream of vertices in object space with all their associated information (texture coordinates, per-vertex color, etc.).

The next stage, the Geometry Stage, receives vertices in object space from the host interface and outputs them in screen space.

To translate the scene from 3D data to 2D data, all the objects of a scene have to be transformed through several spaces, each with its own coordinate system, before the 3D image can be projected onto a 2D plane. These transformations are applied on a per-vertex basis.
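The chain of coordinate spaces can be sketched with 4x4 matrices in plain Python. The particular matrices below (two translations and a very simple perspective matrix) and all names are assumptions chosen to keep the arithmetic easy to follow; real pipelines also include rotation, scaling, clipping and a viewport transform.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    """Apply a 4x4 matrix to a homogeneous column vector (x, y, z, w)."""
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(4))

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def perspective(d):
    """Project onto the plane z = -d; w receives -z / d for the later divide."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, -1.0 / d, 0.0]]

model = translation(2.0, 0.0, 0.0)   # object space -> world space
view = translation(0.0, 0.0, -5.0)   # world space  -> camera space
proj = perspective(1.0)              # camera space -> clip space

# Compose once, then reuse the same matrix for every vertex of the object.
mvp = mat_mul(proj, mat_mul(view, model))

def to_screen_space(vertex):
    x, y, z, w = transform(mvp, (*vertex, 1.0))
    return (x / w, y / w)            # perspective divide yields 2D coordinates

corner = to_screen_space((1.0, 1.0, 0.0))
```

Because the composed matrix is the same for every vertex of the object, each vertex can be transformed independently, which is what makes this stage well suited to parallel hardware.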

The pipeline then enters its final stage, Rasterization. The GPU traverses the 2D image and converts the data into a number of "pixel candidates", called fragments, which become the pixels of the final image. A fragment is a data structure that contains attributes such as position, color, depth, texture coordinates, etc.

Fragments are generated by checking which parts of a given primitive cover which pixels of the screen. If a fragment lies inside a primitive but not exactly on one of its vertices, its attributes have to be calculated by interpolating the attributes of the vertices. After that, further steps produce the final pixels. Colors are calculated by combining textures with other attributes such as color and lighting, or by blending a fragment with another translucent fragment or with optional fog (a graphical effect).
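Fragment generation and attribute interpolation can be sketched in Python using edge functions and barycentric weights. This is one common way to do it, not necessarily what the hardware described here does; the function names are hypothetical, the triangle is assumed to be non-degenerate with counter-clockwise vertex order, and pixels are sampled at their centres without any tie-breaking fill rule.

```python
def edge(a, b, p):
    """Signed area term: positive when p lies to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(v0, v1, v2, c0, c1, c2, width, height):
    """Return (x, y, color) fragments for one counter-clockwise triangle."""
    area = edge(v0, v1, v2)
    if area <= 0:
        return []                      # degenerate or clockwise: nothing drawn
    fragments = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)     # sample at the pixel centre
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue               # pixel centre lies outside the triangle
            # Barycentric weights: how close p is to each vertex.
            b0, b1, b2 = w0 / area, w1 / area, w2 / area
            color = tuple(b0 * r + b1 * g + b2 * b
                          for r, g, b in zip(c0, c1, c2))
            fragments.append((x, y, color))
    return fragments

# Hypothetical triangle with red, green and blue corners on a 4x4 screen.
frags = rasterize((0.0, 0.0), (4.0, 0.0), (0.0, 4.0),
                  (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), 4, 4)
```

Each fragment's color is a weighted mix of the three vertex colors, so a pixel near the first vertex comes out mostly red. Depth and texture coordinates would be interpolated with the same weights.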

For a modern GPU, the rendering stages are shown below:

Working cycle of GPU

The GeForce 6 Series GPU represents the classic architecture model. A later series, the GeForce 8 Series, brought a matured GPU architecture into the next generation.

The turning point of GPU Architecture - Unified Shader Model

The architecture now moves to a fully programmable pipeline. The GeForce 8 Series is built around programmable units and a unified shader design. The vertex, geometry and fragment stages all become programmable and run on the same pool of units, which greatly increases the programmability of the pipeline.
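One way to see why a unified design helps is a small, highly idealized Python model. It assumes every work item costs the same, ignores scheduling overhead, and uses made-up unit counts and workloads; none of the numbers describe a real GPU.

```python
def fixed_pipeline_time(vertex_work, fragment_work, vertex_units, fragment_units):
    """Dedicated units: each pool handles only its own kind of work,
    so the slower pool determines the total time."""
    return max(vertex_work / vertex_units, fragment_work / fragment_units)

def unified_pipeline_time(vertex_work, fragment_work, total_units):
    """Unified units: any unit can take any work item."""
    return (vertex_work + fragment_work) / total_units

# Hypothetical chip with 16 units: split 4 + 12, or fully unified.
vertex_heavy_fixed = fixed_pipeline_time(800, 400, 4, 12)      # 200.0
vertex_heavy_unified = unified_pipeline_time(800, 400, 16)     # 75.0
fragment_heavy_fixed = fixed_pipeline_time(100, 1500, 4, 12)   # 125.0
fragment_heavy_unified = unified_pipeline_time(100, 1500, 16)  # 100.0
```

In this model the fixed split leaves one pool idle whenever the workload does not match the split, while the unified pool stays busy for both kinds of scene.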

Shaders in GeForce 8 Series GPU:

GeForce 8 Series GPU Block Diagram

Two new features, unified shaders and direct access to the compute units through new APIs, map well to a now well-known application of the GPU: general-purpose computing on graphics processing units.

The architecture innovation

GPU architecture innovation follows demand from a world that is increasingly interested in general-purpose programmability and parallel processing.

GPUs are evolving to let users hand over CPU tasks to the GPU. When GPUs are used for non-graphical processing, this is known as GPGPU, or general-purpose computing on graphics processing units.

CPUs are primarily serial processors. While they might have multiple cores and threads, they still function by performing many calculations very rapidly, one after the other. GPUs, on the other hand, have become increasingly parallel processors. While an Intel or AMD CPU might have 2, 4, or 8 cores, an Nvidia GeForce or ATI Radeon GPU can have hundreds, all working at the same time. The individual cores on a GPU perform relatively simple functions, but since they all work together simultaneously, they can perform some impressive mathematical feats.
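The difference in programming style can be sketched in Python with the classic a*x + y ("SAXPY") operation. The thread pool below only imitates the structure of a GPU launch, where one simple function runs once per element; Python threads will not reproduce a GPU's speed, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def saxpy_kernel(i, a, x, y):
    """Work done by one GPU-style thread: a single multiply-add."""
    return a * x[i] + y[i]

def saxpy_serial(a, x, y):
    """CPU style: one core walks through the elements one after the other."""
    out = []
    for i in range(len(x)):
        out.append(saxpy_kernel(i, a, x, y))
    return out

def saxpy_parallel(a, x, y, workers=8):
    """GPU style: the same kernel is handed every index at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the results in index order.
        return list(pool.map(lambda i: saxpy_kernel(i, a, x, y), range(len(x))))

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
serial_result = saxpy_serial(2.0, x, y)
parallel_result = saxpy_parallel(2.0, x, y)
```

Both versions give the same answer. The point is that no element depends on any other, so the work can be spread over as many simple cores as are available.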

Today, GPU performance growth still follows Moore’s Law in its popular form, which says that computing performance doubles roughly every 18 months.
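Reading the 18-month rule as a doubling of performance, the implied growth over a longer period is simple compound arithmetic. The short Python sketch below works it out; the function name is made up for the example, and the result is what the rule predicts, not a measured figure.

```python
def growth_factor(months, doubling_period_months=18.0):
    """Performance multiplier after `months` if performance doubles every period."""
    return 2.0 ** (months / doubling_period_months)

one_period = growth_factor(18)   # exactly one doubling
decade = growth_factor(120)      # ten years, about 6.7 doublings
```

Over the ten years from 2000 to 2010 this rule predicts roughly a hundredfold increase.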

GPU performance graph 2000-2010:


Current and Future development

Computing is evolving from “central processing” on the CPU to “co-processing” on the CPU and GPU. CUDA, a parallel computing architecture provided by NVIDIA, has taken the use of GPGPU into the next generation.

In scientific research, the CUDA GPU architecture, together with programs that support it (e.g. AMBER), is applied in academia and in pharmaceutical companies worldwide to accelerate the discovery of new drugs.

In the financial market, GPGPU is also used for counterparty risk applications. Well-known analytics vendors such as Numerix and CompatibL support CUDA for GPGPU and have achieved an 18X speedup.

GPU computing is going mainstream. In recently launched operating systems such as Microsoft Windows 7 and Apple Mac OS X Snow Leopard, the GPU no longer acts only as a graphics processor, but also as a general-purpose parallel processor accessible to any application.

Intel, one of the world's major chip manufacturers, launched the first Core i3 on January 7, 2010. This type of CPU is integrated with a GPU. Examples are the Core i3-5xx and Core i3XXM. The next series, the Core i5, uses the same architectural concept in models such as the Core i5-6XX, Core i5-5XXM and Core i5-5xxUM.

Is our future heading back to the old days of single-processor computing?

References

[1] Wikipedia entry on GPUs

http://en.wikipedia.org/wiki/GPU

[2] Kees Huizing, Han-Wei Shen: "The Graphics Rendering Pipeline"

http://www.win.tue.nl/~keesh/ow/2IV40/pipeline2.pdf

[3] Cyril Zeller: "Introduction to the Hardware Graphics Pipeline", Presentation at ACM SIGGRAPH 2005

http://download.nvidia.com/developer/presentations/2005/I3D/I3D_05_IntroductionToGPU.pdf

[4] ExtremeTech 3D Pipeline Tutorial

http://www.extremetech.com/article2/0,1697,9722,00.asp

[5] Ashu Rege: "Introduction to 3D Graphics for Games"

http://developer.nvidia.com/docs/IO/11278/Intro-to-Graphics.pdf