Expanding TCAD Simulations from Grid to Cloud

In one of our 2015 SISPAD papers, we analyzed the distribution, execution and performance of typical TCAD simulations on traditional grid-based systems versus cloud systems. As it turned out, a hybrid approach is the most efficient and cost-effective solution in many cases.
The necessary algorithms and interfaces are included in GTS Framework. The integrated Design Of Experiment (DOE) and Optimizer modules split simulation jobs for optimal performance in a distributed environment; the Optimizer is especially useful for parameter fitting / inverse modeling.
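To illustrate what splitting a DOE across hosts can look like in principle, here is a minimal sketch. It is not GTS Framework's API: the function names, the factor names and values, and the simple round-robin split are all assumptions made for illustration, and a real scheduler would also weigh expected runtimes.

```python
from itertools import product


def full_factorial(factors):
    """Build every combination of factor levels (a full-factorial DOE)."""
    names = list(factors)
    levels = (factors[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*levels)]


def split_round_robin(jobs, num_hosts):
    """Deal jobs out to hosts one at a time so batch sizes differ by at most one."""
    return [jobs[i::num_hosts] for i in range(num_hosts)]


# Hypothetical device parameters; names and values are illustrative only.
factors = {
    "gate_length_nm": [20, 28, 40],
    "doping_cm3": [1e17, 1e18],
    "vdd_V": [0.8, 1.0],
}

jobs = full_factorial(factors)        # 3 * 2 * 2 = 12 simulation jobs
batches = split_round_robin(jobs, 4)  # one batch per host

for host, batch in enumerate(batches):
    print(f"host {host}: {len(batch)} jobs")
```

Because the individual simulations of a DOE are independent of each other, each batch can run on a different host without any communication between hosts.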
Physical Device Simulation in a Multi-Host Environment
Power and Allocation of Resources
Grid-based systems such as Sun Grid Engine (SGE) or Load Sharing Facility (LSF) are pervasive in the TCAD domain, and their nodes are usually highly optimized for the required kind of computation. In contrast, typical cloud solutions are less optimized, but they can create additional computing hosts on demand, tapping into a virtually unlimited pool of resources. This makes it possible to allocate huge amounts of computational power just when needed, a major boost for time-critical projects that can translate into a significant business advantage.
Cost Structure
The cost structure of cloud resources typically differs from that of an on-premise grid-based system. On popular cloud services, billing granularity is one hour of node uptime, so the total cost can vary widely depending on the runtime of the submitted tasks and on the scheduling algorithm used (for details, refer to our paper below).
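The effect of hourly billing can be made concrete with a small calculation. The sketch below is not the scheduling algorithm from the paper. It assumes that every started hour of node uptime is billed in full, uses eight hypothetical 15-minute tasks, and compares two simple strategies: one node per task, and greedy packing of tasks onto a fixed number of nodes.

```python
import math


def billed_hours(node_uptimes):
    """Total billed node-hours, assuming each started hour is billed in full."""
    return sum(math.ceil(hours) for hours in node_uptimes)


def one_node_per_task(task_hours):
    """Start a separate node for every task; uptime equals task runtime."""
    return list(task_hours)


def packed(task_hours, num_nodes):
    """Greedy packing: assign the longest remaining task to the least-loaded node."""
    loads = [0.0] * num_nodes
    for hours in sorted(task_hours, reverse=True):
        target = loads.index(min(loads))
        loads[target] += hours
    return loads


tasks = [0.25] * 8  # eight 15-minute simulations (hypothetical workload)

spread = one_node_per_task(tasks)
dense = packed(tasks, num_nodes=2)

print("one node per task:", billed_hours(spread), "node-hours,",
      max(spread), "h wall-clock")
print("packed on 2 nodes:", billed_hours(dense), "node-hours,",
      max(dense), "h wall-clock")
```

Under these assumptions, one node per task finishes in 15 minutes but is billed for 8 node-hours, while packing onto two nodes takes one hour and is billed for 2 node-hours. The trade-off between turnaround time and cost is exactly what the choice of scheduling algorithm controls.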