We'll be holding a CentOS Dojo at Oak Ridge National Laboratory, in Oak Ridge, Tennessee, on Tuesday, April 16th, 2019.

REGISTER TODAY

Watch @CentOSProject on Twitter for updates

Schedule

Monday

Drinks and light hors d'oeuvres at The Casual Pint, 6-9pm

Tuesday

TIME          SESSION
9-10am        Keynote: Jack Wells, Director of Science, ORNL
10-11am       Numan Laanait - Decoding Inverse Problems in Materials with Deep Learning on Summit (Details)
11am-12pm     Alex Volkov - Scaling GPU workloads on datacenter clusters using GPU and MPI aware containers (Details)
12-12:30pm    Lunch
12:30-1pm     Optional Tours of Summit and Titan
1-2pm         Gerlof Langeveld - Practical use of Linux capabilities (Details)
2-3pm         Dalton Lunga, ORNL - Spark + TF for Satellite Image Analysis
3-4pm         Stephen Smoogen - A Fist Full of Packages ... [what RPM brings to the game with modern software stacks] (Details)
4-5pm         Bryant Nelson, Cognitive Software Engineer, IBM (Details)
7-10pm        Social at downtown location TBD

Abstracts

Dr. Numan Laanait - Decoding Inverse Problems in Materials with Deep Learning on Summit

Many challenging inverse problems permeate materials science and physics. The latest developments in machine learning promise to offer robust and accurate solutions to some of these age-old problems. Yet questions of interest in the physical sciences carry far more complexity, both computational and conceptual, than is addressed by machine learning advances emanating from the tech sector. In this talk, I will report on research aimed at the development of new machine learning models and their deployment on supercomputers to address the scientific needs of materials science and physics. In particular, I will show that distributed deep learning, implemented on Oak Ridge National Lab’s Summit supercomputer (and scaled to 10,000 GPUs), gives promising results in “inverting” electron scattering data into the electron density of materials, an inverse problem that has remained unsolved for nearly 80 years.

Dr. Numan Laanait is a computational and experimental physicist in the Computational Sciences and Engineering Division at Oak Ridge National Laboratory. His current research focuses on investigating the organizing principles of correlated systems, in both hard and soft matter, by developing novel computational techniques from the fields of machine learning and high-performance computing. Dr. Laanait joined Oak Ridge National Lab in 2014 as a Eugene P. Wigner Fellow and received a Ph.D. in Condensed Matter Physics from the University of Illinois in 2012. In the past, Dr. Laanait has led the development of open source software packages and the design of instruments at large-scale experimental user facilities.

Gerlof Langeveld - Practical use of Linux capabilities

In conventional UNIX systems, processes running under a 'normal' user identity had no specific privileges whatsoever, while processes running under the root identity had all special privileges, like the ability to reboot the system, to kill any process, to open raw sockets, and so on. The capability mechanism implemented by the Linux kernel enables a process to get only a limited set of these privileges, just enough to do the special tasks that this process is supposed to do. Nowadays, capabilities are used by systemd to provide specific privileges to services and by Docker to provide specific privileges to the process that is running in a container. Furthermore, capabilities are used as an alternative to setuid executables that enable normal users to run a specific program (like ping) under the root identity. In this presentation I will explain how the capability mechanism works and how systemd, containers and executable files are related to this feature.

Gerlof Langeveld is a trainer/consultant for AT Computing in the Netherlands. He teaches courses about programming languages (like Python and C) and courses about the Linux operating system, like 'Linux System Programming' and 'Linux Performance Analysis and Tuning'. Gerlof has been involved with performance analysis for more than 25 years and has published various articles about this subject in technical magazines. He created and maintains the open source monitor program 'atop' (and the related kernel module 'netatop'), which is available in the repositories of most Linux distributions and via the website https://www.atoptool.nl

Alex Volkov - Scaling GPU workloads on datacenter clusters using GPU and MPI aware containers

Graphics Processing Units (GPUs) are critical to modern HPC (high performance computing) and ML/DL (machine learning/deep learning) workloads. The requirements of engineers and scientists can easily scale to petaflops, whereas current state-of-the-art GPU performance is in the teraflops range. Continuing in the tradition of cluster computing, GPUs are scaled to petaflops performance by using traditional technologies such as MPI (message passing interface), high performance interconnects (such as InfiniBand and RoCE), and RDMA (remote direct memory access). The presentation will explore the challenges involved with multi-node scaling and how containerization is helping manage the software complexities of running workloads on clusters. An overview will be presented of how to orchestrate multi-node workflows on GPU hardware with MPI using containers. The container technology focus in the presentation will be on Docker, Singularity, HPC resource schedulers such as SLURM and PBS, and container orchestration platforms such as Kubernetes.

Alex Volkov holds a Bachelor's and a Master's in Electrical Engineering from Illinois Institute of Technology (IIT). He worked at Northrop Grumman for 8 years as a Systems Engineer developing and writing signal processing algorithms, and at ExxonMobil for 3 years as a High Performance Computing (HPC) programmer on seismic imaging projects. He is currently at NVIDIA working as a Solutions Architect covering the Midwest and East regions. He supports and works on technologies and software using GPUs: HPC, deep learning, accelerated analytics, etc.

Dr. Bryant Nelson

Dr. Bryant Nelson is a Cognitive Software Engineer at IBM. Dr. Nelson holds a PhD in computer science from Texas Tech University, a BS in computer science from Texas Tech University, and a BS in mathematics from Texas Tech University. Dr. Nelson’s early research was related to programming language design and automatic programming, specifically the automatic distribution of computation from a high-level declarative language. Dr. Nelson currently works on optimizing the IBM PowerAI offering, including the development and optimization of the PowerAI DDL library and toolset.

Stephen Smoogen - A Fist Full of Packages ... [what RPM brings to the game with modern software stacks]

Node.js has npm, Python has pip, Ruby has gem, and Rust has cargo... and your summer grad students used all of those and some others on your research project. You now need to make sure that the code can be stored and rebuilt over the next 10+ years. We will go over various methods, things to look out for regarding the long-term viability of your software choices, and how RPMs can be used to solve many auditing problems 10 years later.

Stephen Smoogen is a System Administrator who has been compiling and packaging software somehow since the early 1990s. He has worked for Los Alamos National Labs (twice), Sandia National Labs (once), and since 2009 has worked for Red Hat. He has helped administer multiple Unix/Linux networks for scientists, academics and [redacted], so he has dealt with the many different problems that people in those spaces have faced and still face.

Dalton Lunga

Dalton Lunga is a Staff Scientist in Artificial Intelligence, Machine Learning and Geographic Data Sciences at the Oak Ridge National Laboratory. He currently leads AI research innovations in the following areas: large-scale object detection for land use and land cover mapping, advanced workflows for AI deployment in high performance computing environments, large workflows for image search and retrieval, and optimization methods and representation learning with large varying-density data. Prior to joining ORNL, Dalton was a senior researcher at the Council for Scientific and Industrial Research (CSIR), South Africa, working on machine learning for natural language processing, manifold learning, and interactive visual analytics systems. Dalton received his M.S. and Ph.D. in Electrical and Computer Engineering from Purdue University, West Lafayette, as a Fulbright scholar. He serves in multiple professional organizations, including as program chair for Artificial Intelligence and Geographic Knowledge Discovery (SIGSPATIAL GeoAI Workshops).

Register!

Register to attend.

The event is free, but you need to register since access to our venue is security restricted.

Social Events

We will have two social gatherings - one the evening before the event, and one the evening of the event - where you can rub elbows with the speakers, other attendees, and local industry experts. Details of these events will be coming soon, but please plan your travel accordingly if you are coming from out of town.

Where To Stay

Oak Ridge (15 min drive to the visitor center where the event is held):

West Knox (about 25-30 min drive, but closer to food, nightlife, and things to do)

Downtown (pricey but lots to do! 30-45 min drive)

Events/Dojo/ORNL2019 (last edited 2019-05-13 16:24:32 by RichBowen)