01 May 2010
By Sachin Pathak, Mentor Graphics
The world is moving towards complex chip design. High-end devices are required for scientific computing, telecommunication, multimedia, graphics and consumer electronics. Such state-of-the-art designs require complicated systems and high-end EDA tools, yet current EDA tools and hardware platforms are stretched too thin to handle the flow. This is affecting the growth of the semiconductor industry and is restricting the move into the new era of electronics design. The EDA industry has heeded the wake-up call and is now taking steps to develop IC design tools capable of parallel and multicore computation.
Discussions on technical computing are now focused on parallel programming as well as on customizing algorithms to utilize hardware effectively. Fueling these discussions are several factors: the demand for high-performance systems, the introduction of multicore/multiprocessor systems and the growing availability of computer clusters.
Until recently, commercial high-end tools to support the development of technical computing applications for high-performance systems did not exist. Parallel programming was confined to a small group of people and was considered an art practiced by specialists, who focused on achieving maximum performance by using custom setups and by tuning their applications for specific hardware. Parallel programming solutions must now look beyond custom algorithms and raw performance. Parallel application tools are being developed that assist design engineers in designing, developing, debugging and evolving hardware.
To succeed, we need to extend standard serial-architecture tools so that they support parallel multicore/multiprocessor architectures without extensive modification of the code, which could lead to a robust integrated development environment (IDE). In practice, however, it is impossible to do this with only minimal changes to legacy code. That could force expensive software rewrites and shift market momentum to a new generation of EDA startups.
We all acknowledge that multicore support will be essential in the future, and all EDA vendors claim some multithreading and multicore capabilities today.
Leveraging multiple CPUs is nothing new in the EDA landscape. Some applications, particularly for functional and physical verification, use distributed processing over "farms" of networked workstations, potentially harnessing the power of dozens or hundreds of CPUs. Some of those applications have adopted multithreading to take advantage of workstations with multiple CPUs.
A few multithreaded applications, such as Mentor Graphics’ Calibre DRC physical verification tool, run equally well on distributed networks, multiple-CPU workstations and multicore CPUs. Customers are now placing dual-core and quad-core CPU-based workstations in compute farms, combining both distributed networks and multicore CPUs.
Challenges
One of the difficulties with distributed processing is that the latency between processors is very high. Latency can be reduced with a workstation that contains multiple single-CPU chips, but then the bus is a limiting factor. The greatest speed benefits come from multicore processors on the same die, even though there is little difference between multiple-CPU architectures and multicore architectures from a software implementation standpoint.
Programming for multicore architectures is hard, and trying to adapt legacy applications may prove fruitless. Usually, multicore programming involves the use of threads to distribute work and coordinate responses. From a software point of view, particularly in EDA, many algorithms are inherently sequential and show only limited gains when multithreaded. They will need to be rewritten.
Multicore EDA system design has some known drawbacks, and addressing them is a hot research topic for all of us.
Explicitly parallelizing complex EDA software programs is among the most complex of tasks and could be a critical problem. Debuggers and other tools offer poor support for threading, and multithreading can create race conditions that are extremely hard to debug.
Building a threaded EDA architecture requires ground-up software development. It is hard to implement and could fairly be considered “rocket science”.
Memory and data management will be a challenge for multicore/multiprocessor EDA software, while coding styles that employ global variables and do not separate data and execution will make the rewrite task difficult.
Apart from the significant overhead needed to partition problems into parallel tasks, there is also the post-processing overhead required to re-integrate the results.
There’s a narrow sweet spot for parallelism in EDA analysis tools. Synthesis tools, however, appear more resistant to parallelization because they deal with too many problems and data interdependencies.
If the workload isn't partitioned properly, the communications overhead can swamp any gains brought about by parallelism.
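The partition, process and re-integrate pattern behind several of these drawbacks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`partition`, `check_chunk`, `parallel_check`) and the toy "minimum width" check are hypothetical stand-ins and do not reflect how Calibre or any other EDA tool is implemented. Each worker receives its own chunk of data and shares no global state, which is the coding style that makes such a split feasible in the first place.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_chunks):
    """Split items into contiguous chunks of near-equal size (partitioning overhead)."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def check_chunk(widths, min_width):
    """Hypothetical per-chunk check: return the widths that violate min_width."""
    return [w for w in widths if w < min_width]


def parallel_check(widths, min_width, n_workers=4):
    """Partition the data, check the chunks concurrently, then merge the results."""
    chunks = partition(widths, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        partial_results = list(pool.map(lambda c: check_chunk(c, min_width), chunks))
    # Re-integration step: flatten the per-chunk results into one list.
    return [v for part in partial_results for v in part]
```

Note that in standard CPython, threads do not speed up CPU-bound work like this; the sketch shows the structure only. The steps in `partition` and the final merge are pure overhead, so if the chunks are too small or badly balanced, that overhead can outweigh whatever the workers gain.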
Large EDA vendors acknowledge multicore’s challenges, but they are making good progress. Mentor Graphics, in fact, has one of the earliest multithreaded EDA products with Calibre—a state-of-the-art tool that features multicore capability.
Understand infrastructure cost for actual usage of multicore systems
We have to understand Amdahl's Law, which imposes a kind of speed limit on parallel processing systems. Amdahl's Law, also known as Amdahl's argument, is named after computer architect Gene Amdahl and is used to find the maximum expected improvement to an overall system when only part of the system is improved. It is often used in parallel computing to predict the theoretical maximum speedup using multiple processors. The speedup of a program using multiple processors is limited by the time needed for the sequential fraction of the program.
For example, suppose a program needs 20 hours on a single processor core. If one hour of that work cannot be parallelized, while the remaining 19 hours (95 percent) can, then regardless of how many processors we devote to the parallelized execution of this program, the execution time cannot fall below that critical one hour. Hence the speedup is limited to at most 20x.
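The arithmetic of this example can be checked with a short Python sketch of Amdahl's Law, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n is the number of processors. The function names here are illustrative choices, not part of any library.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Theoretical speedup when parallel_fraction of the work runs on n_processors."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def max_speedup(parallel_fraction):
    """Upper bound on speedup as the number of processors grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # The 20-hour job from the example: 19 of 20 hours can be parallelized.
    p = 19 / 20
    for n in (1, 8, 64, 1024):
        print(f"{n:5d} processors: {amdahl_speedup(p, n):.2f}x")
    print(f"upper bound: {max_speedup(p):.2f}x")
```

With 95 percent of the work parallelizable, 8 processors give a speedup of only about 5.9x and 64 processors about 15.4x. The returns diminish quickly as the fixed one-hour serial portion comes to dominate.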
The demand for high frequency and low power dissipation is the main driving force behind multicore configurations. In conclusion, the EDA industry must stay ahead of the curve. We must strengthen our parallelism capabilities and keep a continuous focus on ease of use, to enable better and faster design of multicore EDA tools and to deliver time-to-market savings for our users.
Author Information
Sachin Pathak is a Corporate Applications Engineer for Mentor Graphics. He can be reached at sachin_pathak@mentor.com.