PD Dr. rer. nat. Josef Weidendorfer

Technische Universität München
Informatik 10 - Lehrstuhl für Rechnertechnik und Rechnerorganisation (Prof. Schulz)
Boltzmannstr. 3, 85748 Garching b. München
Tel.: +49 (89) 289-18454, josef.weidendorfer (at) tum.de

Leibniz Rechenzentrum der Bayerischen Akademie der Wissenschaften
Boltzmannstr. 1, 85748 Garching b. München
Tel.: +49 (89) 35831-8766, josef.weidendorfer (at) lrz.de

 

Short CV

Dr. Josef Weidendorfer is a qualified private lecturer (Privatdozent) for Computer Science at Technische Universität München (TUM). He currently works at the Leibniz Supercomputing Centre (LRZ) as a senior HPC scientist, developing smooth migration strategies for future HPC systems and doing research on system-level and performance analysis tools, as well as on parallel programming models that support the best exploitation of LRZ hardware. To this end, he cooperates closely with the Chair of Computer Architecture and Parallel Systems (CAPS, Prof. Dr. M. Schulz) at TUM. Previously, he was a senior researcher and teaching assistant at TUM, where his research involved the best use of accelerators, heterogeneous computing, and tuning strategies for parallel code, including dynamic code generation techniques. Josef completed his habilitation at TUM in 2016 on simulation-driven performance analysis for parallel code, focusing on capturing bottlenecks in the memory hierarchy of modern architectures and presenting them in a way that hints at adequate performance optimizations. He received his Ph.D. from TUM in 2003 for studying load balancing issues in car crash simulation on industrial code at BMW AG.

Research Interests

Parallel computer architectures, high performance computing, multi-/manycore architectures, GPGPU, performance analysis and optimization, cache simulation, virtual machines, dynamic code generation. Josef is interested in all kinds of strategies for improving the efficiency of computations on various hardware structures, both general-purpose and specialized accelerator hardware (mostly for HPC codes), including the required tools (e.g. for performance analysis) and techniques on the SW/HW boundary (code generation, cache exploitation, ...). To this end, he regularly organizes the UCHPC workshop (since 2010 with Euro-Par) on the use of "unconventional" hardware ideas for HPC. Furthermore, being interested in performance analysis tools, he is co-organizer of the PSTI workshop series as well as the contact person at LRZ as a member of the VI-HPS interest group.

He is the maintainer of the open-source tools Callgrind/KCachegrind for cache simulation.

Memberships: ACM, GI, Zuse-Gesellschaft

Teaching

For student work (bachelor's/master's theses, IDP, guided research), see here.

Or, even better, ask by e-mail for a meeting to discuss currently open topics.

Winter Term 19/20

Winter Term 18/19

Winter Term 17/18

  • Lecture "Einführung in die Rechnerarchitektur" (introduction to computer architecture)
  • Master-Seminar: Programming models and code generation
  • Master-Seminar: "Hochleistungsrechner: Aktuelle Trends und Entwicklungen" (high-performance computers: current trends and developments)

Summer Term 17

  • Master-Seminar Virtualization Techniques
  • Proseminar Multicore Architectures
  • Lab Course Efficient Programming of Multicore-Systems and Supercomputers
  • Seminar Trends in Computing

Winter Term 16/17

Summer Term 16

Winter Term 15/16

Summer Term 15

Winter Term 14/15

  • Lecture Virtualization Techniques
  • Seminar Programming models and code generation
  • Seminar Akzeleratorarchitekturen (accelerator architectures)
  • Introduction to computer architecture: central exercise (microcode programming)

Supervised Student Work (recent)

  • Vincent Bode: Application-Integrated Fault Tolerance in HPC, Master Thesis, November 2019
  • Alexander Kurtz: Design and Implementation of a Lightweight Communication Backend for HPC/Distributed Applications, Master Thesis, May 2018
  • K. Pröll: Adaptive data layout optimizations for stencil-code using binary rewriting, Master Thesis, March 2018
  • T. Asheim: Evaluation of Binary Rewriting Techniques for MPI, Guided Research, April 2017
  • J. Rodrigues: Mutual Influence of Memory- and Compute-Intensive Parallel Applications - Characterization for Prediction, Master Thesis, Feb 2017
  • M. Eiler: Analysing and Using OpenCL for Processing Laser Scanning Data, Master Thesis, Aug 2016
  • M. Kruk: Evaluation of MPI vs. PGAS for Cache-optimized Benchmarks, Bachelor Thesis, Sep 2016
  • A. Engelke: Using LLVM to Optimize Binary Re-Writing at Runtime, Guided Research, Oct 2016
  • J. Rodrigues: Mutual Influence of Applications for Co-Scheduling, Guided Research, Oct 2016
  • D.A. Suarez Trujillo: Design and Implementation of a Feature Detection Algorithm for Space Debris Detection on the High Performance Data Processor (HPDP), Master Thesis, Oct 2015
  • D.A. Ortiz-Yepes: Page Migration Strategies on NUMA Systems Based on Sampling, Master Thesis, Nov 2015
  • T. Geissler: A tool for efficient analysis of Memory Access Behaviour of HPC Applications, Bachelor Thesis, Mar 2015
  • S. Bartels: Investigation of the Portability of an Image Processing Algorithm on a Reconfigurable Space-borne Parallel Processor, Master Thesis, Aug 2014
  • L. Kowalczyk: Design and Implementation of an Automatic Tuning Solution for GPU Programs, Master Thesis, May 2014
  • I. Vadasz: Hardware Transactional Memory for Cache Simulation, Master Thesis, April 2014
  • G. Kukreja: Host compiled simulation to estimate time and power consumption of embedded systems, Master Thesis, Nov 2014
  • S. Hertle: Adaptive Usage of Hardware Transactional Memory on Haswell Processors, Bachelor Thesis, Oct 2014
  • J. Kranz: Generating Fast Code Generators, Interdisciplinary Project, 2013
  • M. Plichta: Faster Sparse Matrix Operations by Code Generation Embedding Prefetching, Bachelor Thesis, Aug 2013

Research