Several efficient parallel runtimes are available in source-code form, along with models able to predict execution time or system utilization. At the moment, we provide two runtimes: one for NUMA architectures and one for integrated CPU/GPU processors. The packages are provided as an artifact.
Maximizing system utilization for co-located applications
This framework models the underlying NUMA architecture with a queuing model and improves the turnaround time and throughput of co-located parallel applications by maximizing overall system throughput. Click on the [artifact] link to access the software.
Younghyun Cho, Camilo A.C. Guzman, and Bernhard Egger. "Maximizing System Utilization via Parallelism Management for Co-Located Parallel Applications." To appear in Proceedings of the 2018 International Conference on Parallel Architectures and Compilation Techniques (PACT'18), Limassol, Cyprus, November 2018.
Younghyun Cho, Surim Oh, and Bernhard Egger. "Online Scalability Characterization of Data-Parallel Programs on Many Cores." In Proceedings of the 2016 International Conference on Parallel Architectures and Compilation Techniques (PACT'16), Haifa, Israel, September 2016.
Maximizing throughput of OpenCL applications on integrated CPU/GPU architectures
Integrated CPU/GPU architectures (APUs) enable fast and efficient workload balancing on the cores of the CPU and the GPU. Our approach is completely online and does not require any offline performance characterization or prior application profiling.
Younghyun Cho, Florian Negele, Seohong Park, Bernhard Egger, and Thomas R. Gross. "On-The-Fly Workload Partitioning for Integrated CPU/GPU Architectures." To appear in Proceedings of the 2018 International Conference on Parallel Architectures and Compilation Techniques (PACT'18), Limassol, Cyprus, November 2018.
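To make the idea of online CPU/GPU workload balancing more concrete, the following Python sketch splits a batch of work items in proportion to the throughput each device showed on a small first chunk. It is only an illustration under simplifying assumptions (constant per-device throughput, no transfer or launch overhead) and is not the partitioning algorithm from the paper above. The function names `gpu_share_from_rates` and `split_work` and the measurement values are hypothetical.

```python
def gpu_share_from_rates(cpu_items, cpu_seconds, gpu_items, gpu_seconds):
    """Estimate the fraction of work the GPU should receive.

    Illustrative assumption: each device's throughput (items per second)
    stays roughly constant, so splitting work in proportion to throughput
    lets both devices finish at about the same time.
    """
    cpu_rate = cpu_items / cpu_seconds
    gpu_rate = gpu_items / gpu_seconds
    return gpu_rate / (cpu_rate + gpu_rate)


def split_work(total_items, gpu_share):
    """Split total_items into (cpu_items, gpu_items) according to gpu_share."""
    gpu_items = round(total_items * gpu_share)
    return total_items - gpu_items, gpu_items


# Hypothetical measurements from a small first chunk run on each device.
share = gpu_share_from_rates(cpu_items=100, cpu_seconds=1.0,
                             gpu_items=300, gpu_seconds=1.0)
cpu_part, gpu_part = split_work(1000, share)
print(share, cpu_part, gpu_part)
```

In an online setting, the share would be re-estimated from fresh timings after every chunk, so that no offline profiling step is needed.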
SnuMAP: profiling parallel architectures
SnuMAP is an open-source application and system profiler for parallel architectures. SnuMAP provides detailed execution trace information and easy visualization for one or multiple concurrent parallel applications that are executed on a multi/many-core platform.
A patch for Xen to enable space-efficient VM checkpointing is available here. The corresponding research papers detailing the method are:
Bernhard Egger, Eunbyung Park, Younghyun Cho, Changyeon Jo, and Jaejin Lee. "Efficient Checkpointing of Live Virtual Machines." In IEEE Transactions on Computers (TC), Volume 65, Issue 10, pp. 3041 - 3054, January 2016.
Eunbyung Park, Bernhard Egger, and Jaejin Lee. "Fast and space efficient virtual machine checkpointing." In Proceedings of the ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE'11), Newport Beach, USA, March 2011.
Live migration modeling
We have developed an automatic, machine-learned model to predict several key metrics of live migration. To train the model, we have gathered around 50,000 data points of live migrations running different benchmarks. The data set and the machine learning model are available here; the details of our method are described in:
Changyeon Jo, Youngsu Cho, and Bernhard Egger. "A Machine Learning Approach to Live Migration Modeling." In Proceedings of the 2017 ACM Symposium on Cloud Computing (SoCC'17), Santa Clara, USA, September 2017.
Changyeon Jo, Changmin Ahn, and Bernhard Egger. "A Machine Learning-based Approach to Live Migration Modeling." Presented at the 4th International Workshop on Efficient Data Center Systems (EDCS'16), Seoul, Korea, June 2016.