An adventure on sparse data visualization and performance engineering

Term: 
2023-2024 Summer
Faculty Department of Project Supervisor: 
Faculty of Engineering and Natural Sciences
Number of Students: 
5

SparseViz is a low-code library designed to enhance the visualization and ordering of sparse matrices and tensors without requiring its users to write code. It provides a user-friendly interface and tools that allow researchers and developers to explore sophisticated sparse data structures such as graphs, matrices, and tensors easily and to identify the causes of performance bottlenecks. Overall, it aims to offer a comprehensive ecosystem for handling sparse data, emphasizing ease of use and integration with existing computational frameworks, thereby making advanced data processing techniques accessible to a wider audience. The tool has been developed within an EU-funded HORIZON project, and the current repository is https://github.com/sparcityeu/sparseviz.
By the end of the project, you will have learned what sparse data and sparsity are, and you will be familiar with the sparse data structures used to model graphs and tensors. Depending on your role, you will implement kernels/functions over these data structures.
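To give a flavor of what a kernel over a sparse data structure looks like, here is a minimal C++ sketch of a matrix in compressed sparse row (CSR) format together with a sparse matrix-vector multiplication. The names `CsrMatrix`, `spmv`, and `example_matrix` are hypothetical and chosen for illustration only; they are not taken from the SparseViz code base.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not SparseViz's actual type.
// Only the nonzero entries are stored, row by row.
struct CsrMatrix {
    std::size_t n_rows = 0;
    std::size_t n_cols = 0;
    std::vector<std::size_t> row_ptr;  // size n_rows + 1; row i spans [row_ptr[i], row_ptr[i+1])
    std::vector<std::size_t> col_idx;  // column index of each nonzero
    std::vector<double> values;        // value of each nonzero
};

// Computes y = A * x by visiting only the stored nonzeros of A.
std::vector<double> spmv(const CsrMatrix& A, const std::vector<double>& x) {
    assert(x.size() == A.n_cols);
    assert(A.row_ptr.size() == A.n_rows + 1);
    std::vector<double> y(A.n_rows, 0.0);
    for (std::size_t i = 0; i < A.n_rows; ++i) {
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
            y[i] += A.values[k] * x[A.col_idx[k]];
        }
    }
    return y;
}

// Builds the 3x3 example matrix
//   [ 4 0 1 ]
//   [ 0 2 0 ]
//   [ 3 0 5 ]
// which has 5 nonzeros out of 9 entries.
CsrMatrix example_matrix() {
    CsrMatrix A;
    A.n_rows = 3;
    A.n_cols = 3;
    A.row_ptr = {0, 2, 3, 5};
    A.col_idx = {0, 2, 1, 0, 2};
    A.values = {4.0, 1.0, 2.0, 3.0, 5.0};
    return A;
}
```

Calling `spmv(example_matrix(), {1.0, 2.0, 3.0})` yields the vector (7, 4, 18). The outer loop over rows is independent per row, which is the kind of loop that shared-memory parallelization typically targets.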
There are multiple tasks that require different skills and expertise. Here is what you need to be accepted to this project:

  • Excellent (for R1) / Good (for R2): C++ skills: SparseViz is written in C++ to leverage its performance and memory management features, which are crucial for handling sparse data structures efficiently. Students need a strong grasp of C++ programming, including familiarity with its modern features (C++11/14/17 and beyond), to understand the library's core algorithms and contribute to its development.

The remaining skills depend on the role you will have:
R1: SparseViz Core Member

  • Good: OpenMP knowledge: OpenMP is an API that supports multi-platform shared-memory parallel programming in C, C++, and Fortran on most processor architectures and operating systems. Good knowledge of OpenMP is essential for students to understand and improve the parallel processing aspects of SparseViz, enabling it to handle large datasets more efficiently by utilizing multiple cores or threads.
  • Bonus: CUDA programming: While not mandatory, knowledge of CUDA programming will be useful. CUDA allows programming of NVIDIA GPUs, which can significantly accelerate the processing of sparse matrices and tensors. Students with CUDA programming skills will be well equipped to extend SparseViz's capabilities on large-scale data and will have more options for performance optimization.

R2: SparseViz Wrapper Developer

  • Good: Python, MATLAB, and/or Julia skills: To successfully create wrappers for SparseViz in Python, Julia, and MATLAB, a team member should have a solid understanding of C++ and the target languages, with some experience in cross-language binding tools such as Cython, CFFI, or Pybind11 for Python, CxxWrap.jl for Julia, and MEX files for MATLAB. Proficiency in automated testing and continuous integration, and the ability to identify performance bottlenecks, are pluses that help ensure the reliability and efficiency of the wrappers. The development process also requires the ability to produce clear documentation and usage examples to facilitate easy adoption.

R3: SparseViz Visualization Engineer

  • Good: HTML, Python, and Plotly skills: Extending the visualization capabilities of SparseViz will strengthen its analysis capabilities by providing interactive, customizable, and high-performance visualizations for sparse data structures. We use interactive HTML charts produced with Plotly. Currently, the visualizations focus mainly on sparsity. This upgrade will transform SparseViz into a more comprehensive tool for performance data exploration and analysis, broadening its applicability across various research and development fields.

Please send a transcript/CV to the supervisor. You can find the contact details below.

Related Areas of Project: 
Computer Science and Engineering