We'll be holding a CentOS Dojo at Oak Ridge National Laboratory, in Oak Ridge, Tennessee, on Tuesday, April 16th, 2019.






Keynote - John Turner, Group Leader of the Computational Engineering & Energy Sciences Group (CEES) (Details)


Bryant Nelson, IBM Cognitive Software Engineer (Details)


Steve Smoogen - A Fist Full of Packages ... [what RPM brings to the game with modern software stacks] (Details)




Optional Tours of Summit and Titan


Alex Volkov - Scaling GPU workloads on datacenter clusters using GPU and MPI aware containers (Details)


Dalton Lunga - Spark + TF for Satellite Image Analysis (Details)


Gerlof Langeveld - Practical use of Linux capabilities (Details)


Doug Fuller - The Ceph Storage System: A Technical Overview (Details)


Social at downtown location TBD


John Turner - The Changing Landscape of Large-scale Computational Science

We’ll discuss the rise of heterogeneous architectures for high-performance computing platforms over the last decade, the convergence of large-scale data analytics and traditional computational science, and now the influence and impact of machine learning and artificial intelligence on current and future HPC systems. An overview of the Department of Energy’s Exascale Computing Project, including hardware, software, and applications, will be included.

Dr. John A. Turner has almost 25 years of experience applying computational science to challenging problems ranging from nuclear energy and stockpile stewardship to battery safety and additive manufacturing. He is a Distinguished R&D Staff Member at Oak Ridge National Laboratory (ORNL), serves as Group Leader for Computational Engineering & Energy Sciences (CEES), and has several roles in the Exascale Computing Project, including Principal Investigator for the Exascale Additive Manufacturing (ExaAM) project. John received his Ph.D. in Nuclear Engineering from North Carolina State University in 1990. He worked at Los Alamos National Laboratory (LANL) until 1997, when he joined Blue Sky Studios, earning credits on the Academy Award-nominated feature film “Ice Age” and the Oscar-winning short animated film “Bunny”. In 2001 Dr. Turner returned to LANL and became Group Leader of the Computational Physics Group (CCS-2). He also led the application team for the Roadrunner supercomputer, which combined standard processors with enhanced versions of the IBM Cell processor used in PlayStation 3 game consoles and was the first system to achieve sustained performance exceeding 1 petaflop/s. In 2008 John moved to ORNL to form CEES, a new group focused on developing and applying advanced simulation tools to applications such as nuclear energy, electrical energy storage, and additive manufacturing.

Gerlof Langeveld - Practical use of Linux capabilities

In conventional UNIX systems, processes running under a 'normal' user identity had no special privileges whatsoever, while processes running under the root identity had all special privileges, such as the ability to reboot the system, to kill any process, and to open raw sockets. The capability mechanism implemented by the Linux kernel enables a process to get only a limited set of these privileges, just enough to do the special tasks that the process is supposed to do. Nowadays capabilities are used by systemd to provide specific privileges to services and by Docker to provide specific privileges to the process running in a container. Furthermore, capabilities are used as an alternative to setuid executables, which enable normal users to run a specific program (like ping) under the root identity. In this presentation I will explain how the capability mechanism works and how systemd, containers, and executable files relate to this feature.
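As a rough illustration of the mechanism described in this abstract, the sketch below decodes the capability bitmask that the Linux kernel reports for a process. It assumes a Linux system that exposes the `CapEff` field in `/proc/self/status`. The helper names `decode_caps` and `effective_caps` are hypothetical, and only a small subset of the capabilities defined in `<linux/capability.h>` is listed. It is not taken from the talk itself.

```python
# Subset of capability bit numbers from <linux/capability.h>.
# Only a few are listed here for illustration; the kernel defines many more.
CAP_BITS = {
    5: "CAP_KILL",               # send signals to any process
    10: "CAP_NET_BIND_SERVICE",  # bind to ports below 1024
    13: "CAP_NET_RAW",           # open raw sockets (what ping needs)
    22: "CAP_SYS_BOOT",          # reboot the system
}


def decode_caps(mask_hex):
    """Return the names of the known capabilities set in a hex bitmask."""
    mask = int(mask_hex, 16)
    return [name for bit, name in sorted(CAP_BITS.items()) if mask & (1 << bit)]


def effective_caps(status_path="/proc/self/status"):
    """Read the effective capability set (CapEff) of the current process.

    Assumes a Linux /proc filesystem; returns an empty list if the
    CapEff line is not present.
    """
    with open(status_path) as status_file:
        for line in status_file:
            if line.startswith("CapEff:"):
                return decode_caps(line.split()[1])
    return []
```

On a Linux machine, calling `effective_caps()` as an unprivileged user would typically return an empty list, while the same call as root would list every capability in the table. A process granted only the raw-socket privilege would report just `CAP_NET_RAW`.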

Gerlof Langeveld is a trainer/consultant for AT Computing in the Netherlands. He teaches courses about programming languages (like Python and C) and courses about the Linux operating system, like 'Linux System Programming' and 'Linux Performance Analysis and Tuning'. Gerlof has been involved with performance analysis for more than 25 years and has published various articles about this subject in technical magazines. He created and maintains the open source monitor program 'atop' (and the related kernel module 'netatop'), which is available in the repositories of most Linux distributions and via the website https://www.atoptool.nl


Alex Volkov - Scaling GPU workloads on datacenter clusters using GPU and MPI aware containers

Graphics Processing Units (GPUs) are critical to modern HPC (high-performance computing) and ML/DL (machine learning/deep learning) workloads. The requirements of engineers and scientists can easily scale to petaflops, whereas current state-of-the-art performance of a single GPU is in the teraflops range. Continuing in the tradition of cluster computing, GPUs are scaled to petaflops performance by using traditional technologies such as MPI (Message Passing Interface), high-performance interconnects (such as InfiniBand and RoCE), and RDMA (remote direct memory access). The presentation will explore the challenges involved in multi-node scaling and how containerization helps manage the software complexities of running workloads on clusters. It will give an overview of how to orchestrate multi-node workflows that use GPU hardware and MPI in containers. The container technologies covered will be Docker and Singularity, HPC resource schedulers such as Slurm and PBS, and container orchestration platforms such as Kubernetes.


Alex Volkov holds bachelor's and master's degrees in Electrical Engineering from Illinois Institute of Technology (IIT). He worked at Northrop Grumman for 8 years as a Systems Engineer developing and writing signal processing algorithms, and at ExxonMobil for 3 years as a High Performance Computing (HPC) programmer on seismic imaging projects. He is currently at NVIDIA, working as a Solutions Architect covering the Midwest and East regions. He supports and works on technologies and software that use GPUs: HPC, deep learning, accelerated analytics, etc.

Dr. Bryant Nelson

Dr. Bryant Nelson is a Cognitive Software Engineer at IBM. Dr. Nelson holds a PhD and a BS in computer science and a BS in mathematics, all from Texas Tech University. Dr. Nelson’s early research was related to programming language design and automatic programming, specifically the automatic distribution of computation from a high-level declarative language. Dr. Nelson currently works on optimizing the IBM PowerAI offering, including the development and optimization of the PowerAI DDL library and toolset.


Stephen Smoogen - A Fist Full of Packages ... [what RPM brings to the game with modern software stacks]

Node.js has npm, Python has pip, Ruby has gem, and Rust has Cargo, and your summer grad students used all of those and some others on your research project. You now need to make sure that the code can be stored and rebuilt over the next 10+ years. We will go over various methods, and things to look out for regarding the long-term viability of your software choices, and show how RPMs can be used to solve many auditing problems 10 years later.

Stephen Smoogen is a System Administrator who has been compiling and packaging software in one way or another since the early 1990s. He has worked for Los Alamos National Labs (twice), Sandia National Labs (once), and since 2009 has worked for Red Hat. He has helped administer multiple Unix/Linux networks for scientists, academics and [redacted], and so has dealt with the many different problems that people in those spaces have faced and still face.


Dalton Lunga - Spark + TF for Satellite Image Analysis

Dalton Lunga is a Staff Scientist in Artificial Intelligence, Machine Learning and Geographic Data Sciences at the Oak Ridge National Laboratory. He currently leads AI research innovations in the following areas: large-scale object detection for land use and land cover mapping, advanced workflows for AI deployment in high performance computing environments, large workflows for image search and retrieval, and optimization methods and representation learning with large varying-density data. Prior to joining ORNL, Dalton was a senior researcher at the Council for Scientific and Industrial Research (CSIR), South Africa, working on machine learning for natural language processing, manifold learning, and interactive visual analytics systems. Dalton received his M.S. and Ph.D. in Electrical and Computer Engineering from Purdue University, West Lafayette, as a Fulbright scholar. He serves in multiple professional organizations, including as program chair for Artificial Intelligence and Geographic Knowledge Discovery (SIGSPATIAL GeoAI Workshops).

Doug Fuller - The Ceph Storage System: A Technical Overview

Ceph is an object-based, software-defined clustered storage system with high availability. Providing object, file, and block storage interfaces as well as a native API, Ceph can serve many different types of storage roles individually or simultaneously. This presentation will include a technical overview of Ceph and its capabilities, with ample time for technical deep diving by request.

Douglas Fuller is a principal software engineer at Red Hat working on Ceph storage. Prior to joining Red Hat, Doug had a 10-year career in high-performance computing including at Oak Ridge National Laboratory in the NCCS Technology Integration Group. He obtained his bachelor's and master's degrees from Iowa State University, performing his master's research work at DOE's Ames Laboratory in the Scalable Computing Laboratory.

Social Events

We will have two social gatherings - one the evening before the event, and one the evening of the event - where you can rub elbows with the speakers, other attendees, and local industry experts. Details of these events will be coming soon, but please plan your travel accordingly if you are coming from out of town.

Where To Stay

Oak Ridge (15 min drive to visitor center where the event is):

West Knox (about 25-30 min drive, but closer to food, nightlife, and things to do)

Downtown (pricey but lots to do! 30-45 min drive)

Events/Dojo/ORNL2019 (last edited 2019-05-13 16:24:32 by RichBowen)