Parallel Task
A task that can be executed by multiple processors safely (yields correct results). Task
parallelism (also known as function parallelism and control parallelism) is a form of
parallelization of computer code across multiple processors in parallel computing
environments. Task parallelism focuses on distributing execution processes (threads)
across different parallel computing nodes. It contrasts with data parallelism, another
form of parallelism.
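The idea can be sketched in Python, with threads standing in for parallel computing nodes; the two task functions here are illustrative inventions, not part of any particular library:

```python
import threading

results = {}

def load_data():
    # One task: produce some data (placeholder work).
    results["data"] = [1, 2, 3]

def render_report():
    # A different task, executed at the same time as load_data.
    results["report"] = "done"

# Task parallelism: distinct functions run concurrently on separate threads.
t1 = threading.Thread(target=load_data)
t2 = threading.Thread(target=render_report)
t1.start(); t2.start()
t1.join(); t2.join()
```

Each thread executes a different function, which is what distinguishes task parallelism from data parallelism (the same operation applied to partitions of one data set).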
Serial Execution
Execution of a program sequentially, one statement at a time. In the simplest
sense, this is what happens on a one-processor machine. However, virtually all
parallel programs have sections that must be executed serially.
Parallel Execution
Execution of a program by more than one task, with each task being able to execute the
same or different statement at the same moment in time. The simultaneous use of more
than one CPU or processor core to execute a program or multiple computational threads.
Ideally, parallel processing makes programs run faster because there are more engines
(CPUs or cores) running it. In practice, it is often difficult to divide a program in such a
way that separate CPUs or cores can execute different portions without interfering with
each other. Most computers have just one CPU, but some models have several, and multi-
core processor chips are becoming the norm. There are even computers with thousands of
CPUs.
Note that parallel processing differs from multitasking, in which a CPU provides the
illusion of simultaneously executing instructions from multiple different programs by
rapidly switching between them, or "interleaving" their instructions.
Parallel processing is also called parallel computing. In the quest for cheaper computing
alternatives, parallel processing provides a viable option: sophisticated distributed
computing software can make effective use of idle processor cycles across a network.
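A minimal sketch of the same-statement-on-different-data case, using a Python thread pool; the workers stand in for CPUs or cores, and whether they truly execute simultaneously depends on the interpreter and hardware:

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

# Four workers execute the same statement on different data items.
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(square, range(8)))
# squares == [0, 1, 4, 9, 16, 25, 36, 49]
```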
Pipelining
Breaking a task into steps performed by different processor units, with inputs streaming
through, much like an assembly line; a type of parallel computing. In computing, a
pipeline is a set of data processing elements connected in series, so that the output of one
element is the input of the next one. The elements of a pipeline are often executed in
parallel or in time-sliced fashion; in that case, some amount of buffer storage is often
inserted between elements. As the assembly line example shows, pipelining doesn't
decrease the time for a single datum to be processed; it only increases the throughput of
the system when processing a stream of data.
Deep pipelining increases latency, the time required for a signal to propagate
through a full pipe.
A pipelined system typically requires more resources (circuit elements, processing units,
computer memory, etc.) than one that executes one batch at a time, because its stages
cannot reuse the resources of a previous stage. Moreover, pipelining may increase the
time it takes for an instruction to finish.
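The assembly-line picture can be sketched with Python threads, where queues provide the buffer storage inserted between pipeline elements; the two stage functions are illustrative:

```python
import queue
import threading

def stage(inbox, outbox, fn):
    # Each stage consumes from its input buffer and feeds the next one.
    while True:
        item = inbox.get()
        if item is None:           # sentinel: shut this stage down
            outbox.put(None)
            break
        outbox.put(fn(item))

# Buffer storage between the pipeline elements.
q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()

t1 = threading.Thread(target=stage, args=(q1, q2, lambda x: x + 1))
t2 = threading.Thread(target=stage, args=(q2, q3, lambda x: x * 2))
t1.start(); t2.start()

for item in [1, 2, 3]:             # inputs stream through the pipe
    q1.put(item)
q1.put(None)

results = []
while (item := q3.get()) is not None:
    results.append(item)
t1.join(); t2.join()
# results == [4, 6, 8]
```

Each datum still passes through both stages, so per-item latency is unchanged; the gain is that the two stages can work on different items at the same time.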
Shared Memory
From a strictly hardware point of view, describes a computer architecture where all
processors have direct (usually bus based) access to common physical memory. In a
programming sense, it describes a model where parallel tasks all have the same "picture"
of memory and can directly address and access the same logical memory locations
regardless of where the physical memory actually exists. In computing, shared memory
is memory that may be simultaneously accessed by multiple programs with an intent to
provide communication among them or avoid redundant copies. Depending on context,
programs may run on a single processor or on multiple separate processors. Using
memory for communication inside a single program, for example among its multiple
threads, is generally not referred to as shared memory.
Symmetric Multi-Processor (SMP)
Hardware architecture where multiple processors share a single address space and access
to all resources; shared memory computing. In computing, symmetric multiprocessing
or SMP involves a multiprocessor computer architecture where two or more identical
processors can connect to a single shared main memory. Most common multiprocessor
systems today use an SMP architecture. In the case of multi-core processors, the SMP
architecture applies to the cores, treating them as separate processors.
SMP systems allow any processor to work on any task no matter where the data for that
task are located in memory; with proper operating system support, SMP systems can
easily move tasks between processors to balance the workload efficiently. SMP has many
uses in science, industry, and business which often use custom-programmed software for
multithreaded (multitasked) processing. However, most consumer products such as word
processors and computer games are written in such a manner that they cannot gain large
benefits from concurrent systems. For games this is usually because writing a program to
increase performance on SMP systems can produce a performance loss on uniprocessor
systems. Recently, however, multi-core chips are becoming more common in new
computers, and the balance between installed uni- and multi-core computers may change
in the coming years.
The nature of the different programming methods would generally require two separate
code-trees to support both uniprocessor and SMP systems with maximum performance.
Programs running on SMP systems may experience a performance increase even when
they have been written for uniprocessor systems. This is because hardware interrupts that
usually suspend program execution while the kernel handles them can execute on an idle
processor instead. The effect in most applications (e.g. games) is not so much a
performance increase as the appearance that the program is running much more smoothly.
In some applications, particularly compilers and some distributed computing projects,
one will see an improvement by a factor of (nearly) the number of additional processors.
In situations where more than one program executes at the same time, an SMP system
will have considerably better performance than a uni-processor because different
programs can run on different CPUs simultaneously.
Systems programmers must build support for SMP into the operating system: otherwise,
the additional processors remain idle and the system functions as a uniprocessor system.
In cases where an SMP environment processes many jobs, administrators often
experience a loss of hardware efficiency. Software programs have been developed to
schedule jobs so that the processor utilization reaches its maximum potential. Good
software packages can achieve this maximum potential by scheduling each CPU
separately, as well as being able to integrate multiple SMP machines and clusters.
Access to RAM is serialized; this and cache coherency issues cause performance to lag
slightly behind the number of additional processors in the system.
Distributed Memory
In hardware, refers to network based memory access for physical memory that is
not common. As a programming model, tasks can only logically "see" local
machine memory and must use communications to access memory on other
machines where other tasks are executing.
Communications
Parallel tasks typically need to exchange data. There are several ways this can be
accomplished, such as through a shared memory bus or over a network, however
the actual event of data exchange is commonly referred to as communications
regardless of the method employed.
Synchronization
The coordination of parallel tasks in real time, very often associated with
communications. Often implemented by establishing a synchronization point
within an application where a task may not proceed until another task (or tasks)
reaches the same or a logically equivalent point.
Synchronization usually involves waiting by at least one task, and can therefore
cause a parallel application's wall clock execution time to increase.
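A synchronization point of this kind can be sketched with Python's threading.Barrier: no task proceeds past the barrier until every task has reached it:

```python
import threading

barrier = threading.Barrier(3)     # synchronization point for 3 tasks
order = []

def task(name):
    order.append(("before", name))
    barrier.wait()                 # no task proceeds until all arrive
    order.append(("after", name))

threads = [threading.Thread(target=task, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every "before" event precedes every "after" event.
befores = [i for i, (phase, _) in enumerate(order) if phase == "before"]
afters = [i for i, (phase, _) in enumerate(order) if phase == "after"]
```

The first two tasks to arrive at the barrier simply wait, which is exactly the wall-clock cost the definition above describes.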
Granularity
In parallel computing, granularity is a qualitative measure of the ratio of
computation to communication.
Parallel Overhead
The amount of time required to coordinate parallel tasks, as opposed to doing
useful work. Parallel overhead can include factors such as: task start-up time,
synchronizations, data communications, software overhead imposed by parallel
languages, libraries, and the operating system, and task termination time.
Massively Parallel
Refers to the hardware that comprises a given parallel system - having many
processors. The meaning of "many" keeps increasing, but currently the largest
parallel computers comprise processors numbering in the hundreds of
thousands.
Embarrassingly Parallel
Solving many similar, but independent tasks simultaneously; little to no need for
coordination between the tasks. In parallel computing, an embarrassingly parallel
workload (or embarrassingly parallel problem) is one for which little or no effort is
required to separate the problem into a number of parallel tasks. This is often the case
where there exists no dependency (or communication) between those parallel tasks.[1]
Embarrassingly parallel problems are ideally suited to distributed computing and are also
easy to perform on server farms which do not have any of the special infrastructure used
in a true supercomputer cluster.
Scalability
Refers to a parallel system's (hardware and/or software) ability to demonstrate a
proportionate increase in parallel speedup with the addition of more processors.
Factors that contribute to scalability include: hardware (particularly memory-CPU
bandwidths and network communication properties), the application algorithm, the
associated parallel overhead, and the characteristics of the specific application and
its coding.
Multi-core Processors
A multi-core processor is a single chip containing two or more independent processing
units (cores), each of which can execute program instructions.
Cluster Computing
Use of a combination of commodity units (processors, networks or SMPs) to
build a parallel system. A computer cluster is a group of linked computers,
working together closely so that in many respects they form a single computer.
The components of a cluster are commonly, but not always, connected to each
other through fast local area networks. Clusters are usually deployed to improve
performance and/or availability over that provided by a single computer, while
typically being much more cost-effective than single computers of comparable
speed or availability.[1]
Supercomputing / High Performance Computing
Use of the world's fastest, largest machines to solve large problems. A supercomputer is
a computer at the frontline of current processing capacity, particularly speed of
calculation, and among the most powerful machines available at a given time.
Supercomputers introduced in the 1960s were designed
primarily by Seymour Cray at Control Data Corporation (CDC), and led the market into
the 1970s until Cray left to form his own company, Cray Research. He then took over the
supercomputer market with his new designs, holding the top spot in supercomputing for
five years (1985–1990). In the 1980s a large number of smaller competitors entered the
market, in parallel to the creation of the minicomputer market a decade earlier, but many
of these disappeared in the mid-1990s "supercomputer market crash".
The term supercomputer itself is rather fluid, and today's supercomputer tends to become
tomorrow's ordinary computer. CDC's early machines were simply very fast scalar
processors, some ten times the speed of the fastest machines offered by other companies.
In the 1970s most supercomputers were dedicated to running a vector processor, and
many of the newer players developed their own such processors at a lower price to enter
the market. The early and mid-1980s saw machines with a modest number of vector
processors working in parallel to become the standard. Typical numbers of processors
were in the range of four to sixteen. In the later 1980s and 1990s, attention turned from
vector processors to massive parallel processing systems with thousands of "ordinary"
CPUs, some being off the shelf units and others being custom designs. Today, parallel
designs are based on "off the shelf" server-class microprocessors, such as the PowerPC,
Opteron, or Xeon, and most modern supercomputers are now highly tuned computer
clusters using commodity processors combined with custom interconnects.
Grid Computing
Grid computing is the most distributed form of parallel computing. It makes use of
computers communicating over the Internet to work on a given problem. Because of the
low bandwidth and extremely high latency available on the Internet, grid computing
typically deals only with embarrassingly parallel problems. Many grid computing
applications have been created, of which SETI@home and Folding@Home are the best-
known examples.[31]
Most grid computing applications use middleware, software that sits between the
operating system and the application to manage network resources and standardize the
software interface. The most common grid computing middleware is the Berkeley Open
Infrastructure for Network Computing (BOINC). Often, grid computing software makes
use of "spare cycles", performing computations at times when a computer is idling.