In this project, we will develop parallel algorithms with low-memory footprints for the hypergraph-partitioning problem. These algorithms will be implemented on multicore CPUs and, if time permits, on manycore GPUs.
Problem: Today’s sequential hypergraph-partitioning tools excel at producing high-quality results. However, they were designed at a time when data sets were not as large as they are today; the need for quicker partitioning now calls for a different approach. Such approaches have been investigated for graph partitioning, but there is an important gap in the literature for the hypergraph-partitioning problem. Currently, with the fastest and best sequential hypergraph-partitioning tool available, modeling the 1.7GB amazon.com tensor data as a hypergraph and partitioning it into 1024 parts takes 758 minutes and uses 24GB of memory.
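For reference, the memory at issue is essentially that of the hypergraph itself. Below is a minimal sketch, in C++, of the compressed sparse "pins" representation commonly used to store hypergraphs (e.g., in PaToH); the names and layout are illustrative, not taken from any specific tool.

    #include <cstdint>
    #include <vector>

    // Compressed "pins" representation: memory use is O(#nets + #pins),
    // the baseline that a low-memory partitioner's working set must
    // stay within a small multiple of.
    struct Hypergraph {
        int32_t num_vertices = 0;
        std::vector<int64_t> xpins;  // xpins[n]..xpins[n+1] delimit net n's pins
        std::vector<int32_t> pins;   // concatenated vertex lists of all nets
    };

    // Illustrative helper: build the compressed form from explicit net lists.
    Hypergraph build(int32_t num_vertices,
                     const std::vector<std::vector<int32_t>>& nets) {
        Hypergraph h;
        h.num_vertices = num_vertices;
        h.xpins.reserve(nets.size() + 1);
        h.xpins.push_back(0);
        for (const auto& net : nets) {
            h.pins.insert(h.pins.end(), net.begin(), net.end());
            h.xpins.push_back(static_cast<int64_t>(h.pins.size()));
        }
        return h;
    }

With 32-bit vertex ids and 64-bit offsets, this costs roughly 4 bytes per pin plus 8 bytes per net, which gives a sense of the scale the low-memory objective below targets.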
To alleviate these performance and memory bottlenecks, one can use a distributed-memory tool such as Zoltan. Even then, the amazon.com data mentioned above takes 65 minutes to partition into 1024 parts using 64 cores (8 nodes x 8 cores). Another interesting point about distributed-memory partitioners is that, to obtain a scalable one that performs only a small amount of node-to-node communication, one needs a hypergraph that is already well partitioned, which is a chicken-and-egg situation.
In practice, the performance of the partitioning alone does not say much about its applicability. In the literature, it has been shown that hypergraph partitioning is a great asset for realizing various performance requirements, such as minimizing the communication among the nodes within a cluster or increasing the cache utilization of a shared-memory sequential or parallel application. In both cases, partitioning is nothing but a preprocessing step to improve the performance of the application. For this reason, it is only useful when the reduction it brings to the application time exceeds the time spent on partitioning itself; otherwise, the preprocessing step is useless. This should be the main motivation for new partitioners.
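To make this break-even condition precise (the notation below is ours, introduced only for illustration): if a partition is computed once in time T_part and then reused over k runs of the application, partitioning pays off exactly when

    T_part + k * T_app(after) < k * T_app(before),   i.e.,   k > T_part / (T_app(before) - T_app(after)),

where T_app(before) and T_app(after) denote the application time without and with partitioning. A faster, low-memory partitioner lowers T_part and hence the number of runs needed to amortize it.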
Objectives: Novel parallel algorithms will be designed and implemented on CPUs and GPUs. There are three criteria for these algorithms: (a) Low-memory footprint: all algorithms must use little memory (e.g., at most 2-3x the original data size). (b) Scalability: since increasing the core count decreases the application time, a hypergraph-partitioning algorithm that does not scale has little practical use; hence, the performance of all algorithms must improve as additional cores and devices are added. (c) Support for different partitioning metrics: different performance metrics, i.e., objective functions, such as bottleneck communication, maximum message, and total message, are useful for different problems, so the algorithms will support all of these metrics; a sketch of how such metrics can be evaluated follows below.
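As a concrete reference for criterion (c), the following sequential sketch (our illustration, reusing the Hypergraph struct above) evaluates one common family of such objectives for a given K-way partition: the cut-net count, the connectivity-1 (total-volume) metric, and a per-part bottleneck derived from it. Message-count metrics such as maximum or total message would be computed analogously from the same per-net part sets.

    #include <algorithm>
    #include <cstdint>
    #include <unordered_set>
    #include <vector>

    // Cut metrics of a K-way partition (one common set of definitions;
    // the project will support several such objectives).
    struct CutMetrics {
        int64_t cut_nets = 0;        // nets spanning more than one part
        int64_t connectivity_1 = 0;  // sum over nets of (parts touched - 1),
                                     // the usual total-volume objective
        int64_t bottleneck = 0;      // largest per-part share of that volume
    };

    CutMetrics evaluate(const Hypergraph& h, const std::vector<int32_t>& part,
                        int32_t num_parts) {
        CutMetrics m;
        std::vector<int64_t> load(num_parts, 0);
        const int64_t num_nets = static_cast<int64_t>(h.xpins.size()) - 1;
        for (int64_t n = 0; n < num_nets; ++n) {
            std::unordered_set<int32_t> parts;  // distinct parts net n touches
            for (int64_t i = h.xpins[n]; i < h.xpins[n + 1]; ++i)
                parts.insert(part[h.pins[i]]);
            const int64_t lambda = static_cast<int64_t>(parts.size());
            if (lambda > 1) {
                ++m.cut_nets;
                m.connectivity_1 += lambda - 1;
                for (int32_t q : parts) load[q] += lambda - 1;
            }
        }
        for (int64_t l : load)
            m.bottleneck = std::max(m.bottleneck, l);
        return m;
    }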
Novelty: (a) There are parallel tools for partitioning a hypergraph in distributed memory. However, as mentioned above, these tools are tailored to produce high-quality partitions and are very expensive. We will follow a different approach and develop tools based on cheap, scalable algorithms with low-memory footprints. Since partitioning is only a preprocessing step, the practical value of our algorithms and tools will be greater. (b) There is no fully GPU-based hypergraph partitioner in the literature. In contrast, the tool developed in this project will be able to use a CPU-GPU combination.
Term:
2018-2019 Summer
Faculty/Department of Project Supervisor:
Faculty of Engineering and Natural Sciences
Number of Students:
4
Related Areas of Project:
Computer Science and Engineering