Thursday, October 29
 

8:00am PDT

Registration & Breakfast
Registration and breakfast are held in the lobby outside the Salon rooms. Breakfast will include food and drinks.

Thursday October 29, 2015 8:00am - 9:00am PDT
Salon Lobby

9:00am PDT

Welcome
Opening address for the 2015 LLVM Developers' Conference.

Speakers

Tanya Lattner

President, LLVM Foundation


Thursday October 29, 2015 9:00am - 9:15am PDT
Salon III & Salon IV

9:15am PDT

WebAssembly: Here Be Dragons

WebAssembly is a tale of four browser vendors, seeking new languages and capabilities while staying fast, secure and portable. The old JavaScript wizard still has many spells under its belt, but it seeks a companion on its quest to reach VM utopia. WebAssembly is that companion. 

 

In this quest, mad alchemist Dan and jester JF will detail their exploration of LLVM-land. You’ll get to witness firsthand their exploration of ISel and MI, hear of their wondrous encounter with MC, and gasp at the Spell of Restructuring wherein SSA+CFG is transmuted into regs+AST. Will our adventurers conquer the Target and capture the virtual ISA? 

 

Join us in this exciting tale in which *you* are the hero!



Speakers

JF Bastien

Compiler Engineer, Apple
JF Bastien is a compiler engineer, currently focusing on performance and security to bring portable, fast and secure code to the Web. JF is a member of the C++ standards committee, where his mechanical engineering degree serves little purpose. He's also the chair of the WebAssembly...

Dan Gohman

Mozilla
Cranelift. Wasmtime. Rust. WebAssembly. SIMD. Instruction Sets. Control Flow representations. Floating point determinism. NaNs. Did I say Cranelift?


Thursday October 29, 2015 9:15am - 10:00am PDT
Salon III & Salon IV

10:00am PDT

A Proposal for Global Instruction Selection

Our existing instruction selection framework, SelectionDAGISel (SDISel), has some fundamental limitations, including, but not limited to, slow compile time, basic-block-only scope, and a monolithic approach. Over the years, we have spent a lot of effort working around these limitations with more target hooks and more optimization passes (e.g., CodeGenPrepare, ConstantHoisting), which come with their own problems (inaccurate heuristics, having to predict what the instruction selector will do, etc.) and limitations.

We believe that it is time to come up with a new instruction selection framework, global-isel, that will solve these problems while offering new opportunities to improve our code generation. In this talk, we will present our plan to bring global-isel to LLVM.


Speakers
Quentin Colombet


Thursday October 29, 2015 10:00am - 10:45am PDT
Salon III & Salon IV

10:00am PDT

Input Space Splitting for OpenCL

OpenCL programs are prone to memory and control flow divergence. When implementing OpenCL for machines with explicit SIMD instructions, compilers can usually generate more efficient code if they can prove non-divergence of memory and branch instructions. To this end, they leverage a so-called divergence analysis. However, in practice divergence is often input-dependent and exhibited for some, but not all, inputs. Hence, static analyses fail to prove non-divergence. To obtain good performance, developers can manually split the input space; however, this is a tedious and error-prone task.

In this talk we present a new OpenCL-to-CPU compiler pipeline that addresses this problem by automatically ensuring divergence-free control flow through program specialization. To this end we represent the full kernel as well as the implicit work item dimensions in the polyhedral model. For data-dependent control flow and non-affine expressions, overapproximation is used. From the polyhedral iteration domains and memory access functions we can then derive conditions for the absence of memory as well as control divergence. Based on these conditions the input space is split in order to generate specialized kernel versions with beneficial divergence characteristics. Commonly, large parts of the input exhibit regular access and control patterns and only a fixed-size boundary of the input space does not. In such cases we can achieve speedups almost as high as the vectorization width used. However, our technique can also improve the performance of non-diverging kernels due to simplifications in the polyhedral model.



Speakers

Johannes Doerfert

Researcher/PhD Student, Saarland University


Thursday October 29, 2015 10:00am - 10:45am PDT
Salon I & Salon II

10:45am PDT

Break (15 Minutes)
AM Break

Thursday October 29, 2015 10:45am - 11:00am PDT
Salon Lobby

11:00am PDT

Sophisticated Program Analysis on LLVM IR

Over the years, many LLVM users have needed sophisticated program analysis passes that analyze LLVM IR. Some have wanted alias analysis passes such as Steensgaard's and Andersen's, while others have wanted a heap abstraction for performing backwards static slicing on values stored in the heap. Still others have asked for a call graph analysis that can resolve the targets of indirect function calls.

LLVM lacks robust implementations of such analyses. While a few implementations exist, they are maintained by small groups outside of the LLVM source tree and are not of the same quality as the LLVM code. Furthermore, core LLVM optimizations have avoided using sophisticated program analyses, reducing the incentive for creating robust implementations.

 

The goal of this Birds-of-a-Feather session is to bring together users of sophisticated program analysis and LLVM developers to see what users need and whether there are common interests in creating, maintaining, and using these types of analyses. Tentative questions for discussion include:

o What program analyses do LLVM users and developers need that do not exist today?

o What obstacles are there to creating robust implementations of the analyses users need?

o Are there better methods of getting the information reported by these analyses (e.g., the way Type-Based Alias Analysis side-steps the need for Andersen’s alias analysis)?

o Would such analyses benefit existing LLVM optimizations while maintaining acceptable compile-time performance?


Speakers

John Criswell

Assistant Professor, University of Rochester


Thursday October 29, 2015 11:00am - 11:45am PDT
Salon V & Salon VI

11:00am PDT

Beyond Sanitizers: guided fuzzing and security hardening

The Sanitizers (AddressSanitizer & friends) allow you to find many stability and security bugs in C++ code, but they are only as good as your tests are. In this talk we will show how to improve your test coverage with guided fuzzing (libFuzzer) and how to protect your applications in production even if some bugs are still there (Control Flow Integrity and SafeStack).


Speakers

Kostya Serebryany

Software Engineer, Google
Konstantin (Kostya) Serebryany is a Software Engineer at Google. His team develops and deploys dynamic testing tools, such as AddressSanitizer and ThreadSanitizer. Prior to joining Google in 2007, Konstantin spent 4 years at Elbrus/MCST working for Sun compiler lab and then 3 years...


Thursday October 29, 2015 11:00am - 11:45am PDT
Salon I & Salon II

11:00am PDT

Profile-based Indirect Call Promotion

Indirect call promotion (ICP) is the second most profitable profile-based optimization according to a recent study. This talk will present an LLVM ICP pass that iterates over all indirect call sites in the module and selectively transforms them. We will discuss how subsequent optimizations in the compiler pipeline may benefit from ICP.
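
As a rough illustration of the kind of transformation ICP performs (a hand-written sketch, not the pass presented in the talk), the C++ below shows an indirect call site before and after promotion. The function names and the assumption that `square` is the hot target reported by the profile are hypothetical.

```cpp
// Hypothetical example: all names are illustrative and not taken from the talk.
using Handler = int (*)(int);

int square(int x) { return x * x; }  // assumed to be the hot target per the profile
int negate(int x) { return -x; }     // assumed to be a rarely taken target

// Before promotion: the callee is unknown at compile time,
// so the call cannot be inlined or specialized.
int call_indirect(Handler h, int x) { return h(x); }

// After promotion: compare the pointer against the hot target and call it
// directly on a match; keep the original indirect call as the fallback.
int call_promoted(Handler h, int x) {
  if (h == &square)
    return square(x);  // direct call, now visible to the inliner
  return h(x);         // cold path, behavior unchanged
}
```

Both versions return the same result for any handler; the promoted form merely exposes a direct call on the hot path, which is what lets later passes such as inlining benefit.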


Speakers

Ivan Baev

Qualcomm Innovation Center


Thursday October 29, 2015 11:00am - 11:45am PDT
Salon III & Salon IV

11:45am PDT

Women in Compilers & Tools
This BoF will focus on the efforts we as a community and the LLVM Foundation can make to increase the participation of women in LLVM, compilers, and related tools. All (not just women) are encouraged to attend.

Speakers

Tanya Lattner

President, LLVM Foundation


Thursday October 29, 2015 11:45am - 12:30pm PDT
Salon V & Salon VI

11:45am PDT

A Heterogeneous Execution Engine for LLVM

Hexe, which stands for Heterogeneous Execution Engine, is a new compiler component that integrates with the LLVM infrastructure. It targets efficient computation on heterogeneous platforms by allowing the automatic offloading of workloads to computational accelerators, such as Graphics Processing Units (GPUs) or Digital Signal Processors (DSPs).

 

The workloads we consider for offloading are either explicitly annotated by the programmer or automatically detected by static compiler analysis and runtime checks. Our infrastructure operates at the level of LLVM intermediate representation and effectively supports multiple source languages.

 

Hexe consists of a set of compiler passes and a runtime environment. The compiler passes perform the required code analysis and transformations to enable workload offloading. The runtime environment manages data transfers and synchronization operations, and performs dynamic workload scheduling.

 

We consider a diverse set of heterogeneous systems ranging from mobile devices equipped with ARM-based multi-core CPUs, embedded GPUs and DSPs to data center nodes consisting of x86 multi-cores and high-end GPUs. Hexe has a modular design where new accelerator types and programming environments can be supported via a plugin interface. We also consider interoperability between Hexe and modern JIT technologies, such as LLVM MCJIT.



Speakers

Christos Margiolas

The University of Edinburgh


Thursday October 29, 2015 11:45am - 12:30pm PDT
Salon I & Salon II

11:45am PDT

Automated performance-tracking of LLVM-generated code

Ensuring that top-of-trunk consistently generates high-quality code remains harder than it should be. Continuous integration (CI) setups that track correctness of top-of-trunk work pretty well today since they automatically report correctness regressions with a low false positive rate to committers. In comparison, the output generated by CI setups that track performance requires far more human effort to interpret.

In this talk, I’ll describe why I think effective performance tracking is hard and what problems need solving, with a focus on our real world experiences and observations. 

As part of the bring-up of one of the public performance tracking bots, I’ve done an in-depth analysis of its performance and noise characteristics. The insights gained from this analysis drove a number of improvements to LNT and the test-suite in the past year. I hope that sharing these insights will help others in setting up low-noise performance-tracking bots. 

I’ll conclude by summarizing what seem to be the most important missing pieces of CI functionality to make the performance-tracking infrastructure as effective as the correctness-tracking infrastructure. 


Speakers

Kristof Beyls

Senior Principal Engineer, Arm
compilers and related tools, profiling, security.


Thursday October 29, 2015 11:45am - 12:30pm PDT
Salon III & Salon IV

12:30pm PDT

Lunch
Lunch!

Thursday October 29, 2015 12:30pm - 2:00pm PDT
Salon Lobby

2:00pm PDT

Performance tracking & benchmarking infrastructure

A BoF on this topic has been held every dev meeting for the past couple of years. We’ve found these BoFs very useful in identifying the main pain points in using LLVM’s performance tracking infrastructure. Only a limited amount of effort is put into improving the infrastructure and the discussions held at previous BoFs have been very valuable in generating the best-bang-for-buck ideas. Every year, LLVM’s performance tracking infrastructure has improved substantially. Without having the previous BoFs, the improvements would have been less substantial.

 

Last year, the BoF mainly focussed on what was still needed infrastructure-wise to efficiently make good use of the performance data collected by performance tracking bots. A number of significant improvements have been made based on last year’s BoF discussions - but we’re not there yet: currently, it seems like the community is mostly ignoring llvm.org/perf numbers as it’s still too much effort to figure out what’s causing observed performance improvements or regressions. It seems like we’re close to having a good-enough infrastructure for many developers to start caring about and actively using the performance-tracking infrastructure. Let’s discuss what improvements we really still need before we can efficiently make use of llvm.org/perf specifically and LNT-based performance tracking in general.


Speakers

Kristof Beyls

Senior Principal Engineer, Arm
compilers and related tools, profiling, security.

Tobias Grosser

ETH Zurich


Thursday October 29, 2015 2:00pm - 3:00pm PDT
Salon V & Salon VI

2:00pm PDT

Building, Testing and Debugging a Simple out-of-tree LLVM Pass

This tutorial aims at providing solid ground to develop out-of-tree LLVM passes. It presents all the required building blocks, starting from scratch: CMake integration, LLVM pass management, and opt / clang integration. It presents the core IR concepts (SSA form, the CFG, PHI nodes, IRBuilder, etc.) through two simple obfuscating passes. We also take a quick tour of analysis integration through dominators. Finally, it showcases how to use cl and lit to parametrize and test the toy passes developed in the tutorial.


Speakers

Serge Guelton

Quarkslab

Adrien Guinet

Quarkslab


Thursday October 29, 2015 2:00pm - 3:00pm PDT
Salon I & Salon II

2:00pm PDT

Hackers Lab
Thursday October 29, 2015 2:00pm - 6:00pm PDT
Salon III & Salon IV

3:00pm PDT

Profile Guided Optimization

Profile Guided Optimization (PGO) takes advantage of the program's runtime behavior to improve optimization decisions during code generation. In this BoF, we will discuss the existing PGO facilities in LLVM, missing features and plans for addressing them.



Speakers

Ivan Baev

Qualcomm Innovation Center


Thursday October 29, 2015 3:00pm - 4:00pm PDT
Salon V & Salon VI

3:00pm PDT

Creating an SPMD Vectorizer for OpenCL with LLVM

Processors such as CPUs or DSPs often feature SIMD instructions, but are not designed to efficiently support Single Program Multiple Data (SPMD) execution models such as OpenCL. The design of a compiler for such a target therefore needs some form of vectorization to generate efficient code for this kind of data-parallel execution model. This is because SPMD programs are most often written in scalar form with the implicit assumption that many instances of the program are executed in parallel. On CPU-like architectures, SIMD vector units can be leveraged for parallelism, such that each SIMD lane is loosely mapped to a program instance.

 

This tutorial looks at how to create an SPMD vectorizer that targets CPU-like architectures for use with heterogeneous compute frameworks. OpenCL is used as an example but the concepts should translate to other frameworks such as CUDA, RenderScript or Vulkan Compute. While there are other possible approaches, we have chosen to present one that works at the LLVM IR level and that is essentially an IR pass that creates vectorized functions from the original scalar SPMD function. This allows targeting multiple architectures with very little architecture-specific code.

 

We will start by briefly introducing the SPMD execution model, describing how it is used in OpenCL and giving an overview of what an SPMD vectorizer should do and how it differs from other kinds such as LLVM's loop vectorizer and SLP vectorizer. Then we will look at a possible vectorizer design, including the different vectorization stages (analysis, control-flow to data-flow, scalarization, packetization/instantiation and optimization/cleanup). Finally, we will look at some possible optimizations as well as other aspects that do not fit the 'stage-by-stage' presentation (e.g. vectorizing and scalarizing calls to builtin functions, SIMD width detection, interleaved memory access optimizations, SoA to AoS conversions, etc.).



Speakers

Pierre-Andre Saulais

Senior Principal Software Engineer, Codeplay Software
Pierre-Andre is a Senior Principal Software Engineer at Codeplay Software.


Thursday October 29, 2015 3:00pm - 4:00pm PDT
Salon I & Salon II

3:30pm PDT

Snacks
Thursday October 29, 2015 3:30pm - 4:30pm PDT
Salon Lobby

4:00pm PDT

LLVM Foundation
TBD

Speakers

Tanya Lattner

President, LLVM Foundation


Thursday October 29, 2015 4:00pm - 5:00pm PDT
Salon V & Salon VI

4:00pm PDT

Polly - Optimistic Loop Nest Optimizations with Schedule Trees

Polly is an advanced LLVM loop nest optimizer that provides precise memory access analyses and implements on top of them advanced loop optimizations based on a memory-access focused program model.

In the first part of this tutorial we introduce the audience to integer-set-based schedule trees as a way to model loop programs. We explain how we statically model program behavior on the granularity of individual dynamic computations and discuss different program analyses (memory accesses, data-dependences, computational complexity).

We then learn how to perform complex loop transformations using simple per-node operations on an abstract program schedule tree. Such transformations include most classical loop transformations, but also full/partial tile separation, outer-loop vectorization and other more complex transformations. At the end of the first part of this tutorial, the audience understands the general concepts used in Polly.

The second part of this tutorial is focused on Polly's new optimistic optimization infrastructure that enables non-statically provable transformations to be performed optimistically. Discussing optimization-blocking issues such as exception handling code, infinite loops, integer wrapping or out-of-bound memory accesses, we introduce the concept of optimistic assumptions. We then discuss how such assumptions can be described in general, how Polly can collect assumptions, how redundant assumptions are eliminated and how (close to) minimal run-time checks to verify them are generated. At the end of the second part of this tutorial the audience will be able to create optimistic loop optimizations even for cases that lack sufficient static information.



Speakers

Johannes Doerfert

Researcher/PhD Student, Saarland University

Tobias Grosser

ETH Zurich


Thursday October 29, 2015 4:00pm - 5:00pm PDT
Salon I & Salon II

5:00pm PDT

Living Downstream Without Drowning

Have you made changes to your copy of an llvm.org project? Not planning to contribute them back to the open-source project right away? 

Then you are LIVING DOWNSTREAM. 

Have you noticed that there are actually quite a lot of changes made to the upstream projects? Clang + LLVM together see an average of 50 commits every day. This is a FLOOD. Are you seeing lots of conflicts or test failures when you merge from upstream? Spending too much time patching things back together before you can make any progress on your project?

Then you are DROWNING! 

On a project with lots of local changes, managing the flood can be a half-time job all by itself. It's not _exactly_ unproductive time, but it's time you do not spend on your unique project and customizations. At Sony Computer Entertainment, we were drowning... but we've learned to swim with the current, and we are building a lifeboat.

In this combined tech-talk/BOF session, Paul and Mike will talk about SCE's practices and plans for reducing our merge overhead, including source-patch practices and merge/build/test automation. Then, it becomes a BOF where everyone can share their ideas, suggestions and practices for Living Downstream Without Drowning!


Speakers

Michael Edwards

Sony Computer Entertainment

Paul Robinson

Sony Computer Entertainment


Thursday October 29, 2015 5:00pm - 6:00pm PDT
Salon I & Salon II

6:00pm PDT

Reception
Evening reception with food and drinks at SP2 Communal Bar & Restaurant (72 N. ALMADEN AVE. SAN JOSE, 95110).

Thursday October 29, 2015 6:00pm - 10:00pm PDT
SP2 72 N. ALMADEN AVE. SAN JOSE, 95110
 
Friday, October 30
 

8:00am PDT

Registration & Breakfast
Registration and breakfast will be held in the lobby outside the Salon rooms. Food and drink will be served.

Friday October 30, 2015 8:00am - 9:00am PDT
Salon Lobby

9:00am PDT

Swift's High-Level IR: A Case Study of Complementing LLVM IR with Language-Specific Optimization

The Swift programming language is built on LLVM and uses LLVM IR and the LLVM backend for code generation, but it also contains a new high-level IR called SIL to model the semantics of the language (and perform optimizations) at a higher level. In this talk, we discuss the motivations and applications of SIL, including high-level semantic analyses and transformations such as flow-dependent diagnostics, devirtualization, specialization, reference counting optimization, and TBAA, and we compare SIL's design with that of LLVM IR.


Speakers

Joseph Groff

Apple Inc.

Chris Lattner

Apple Inc.


Friday October 30, 2015 9:00am - 10:00am PDT
Salon III & Salon IV

10:00am PDT

Clang Static Analyzer

Discuss the current state of the clang static analyzer and potential future improvements. Potential topics include, but are not limited to, bug triaging and suppression, integration of the static analyzer's path-sensitive reports into clang-tidy, scan-build improvements, and improvements to the analyzer core.


Speakers

Anna Zaks

Apple Inc.


Friday October 30, 2015 10:00am - 10:45am PDT
Salon V & Salon VI

10:00am PDT

LLVM Performance Improvements and Headroom

While LLVM is known for very fast compile time, many developers in the community also push for improving run-time performance of generated code. This talk highlights this year’s performance gains on AArch64 in key benchmarks like SPEC2006, Kernels and also the LLVM test suite. While progress has been impressive, more work needs to be done. Therefore we will discuss future performance headroom, which involves both expanding existing optimizations and architecting new ones.



Speakers

Friday October 30, 2015 10:00am - 10:45am PDT
Salon I & Salon II

10:00am PDT

Typeless Pointers in LLVM IR

In an effort to simplify and canonicalize LLVM IR surrounding pointer expressions, the type information from pointers is being removed. Hear about the current changes, utilities for updating your test cases, as well as current open questions and future work.


Speakers

David Blaikie

Software Engineer, Google Inc.


Friday October 30, 2015 10:00am - 10:45am PDT
Salon III & Salon IV

10:45am PDT

Break (30 Minutes)
AM Break

Friday October 30, 2015 10:45am - 11:15am PDT
Salon Lobby

11:15am PDT

Policies for LLVM's C APIs

Today, LLVM's C API is more or less a stable API, but we don't have very well defined policies on how this works. Some users of llvm-c want stable API interfaces into various parts of the LLVM infrastructure, others want further ABI guarantees about this usage, and still others simply want a way to bind to LLVM through their language frontend’s existing FFI support for C.

If we want to improve the situation for any of these users, we need to properly understand how these APIs are being used (or abused) today. This BoF will serve to allow the various stakeholders to explain what they need so that we can design future extensions properly and come up with well defined policies on how to maintain and evolve the C API.


Speakers

Friday October 30, 2015 11:15am - 12:00pm PDT
Salon V & Salon VI

11:15am PDT

Exception handling in LLVM, from Itanium to MSVC

This talk covers the design and implementation of MSVC-compatible exception handling in Clang and LLVM. Unlike the Itanium C++ exception handling model, the Windows exception handling model is not designed around successive unwinding. As a result, the existing LLVM landingpad instruction is insufficient for expressing how Windows exceptions should be handled. To support Windows exceptions, we added the new token type and a family of new EH pad instructions to LLVM. This talk describes the final design of the new representation and the tradeoffs we made along the way.


Speakers

Reid Kleckner

Software Engineer, Google
I work on Clang, the C++ compiler. I specifically work on C++ ABI compatibility with MSVC, and other Windows-related issues in Clang.


Friday October 30, 2015 11:15am - 12:00pm PDT
Salon I & Salon II

11:15am PDT

Optimizing LLVM for GPGPU

This talk presents Google’s effort to optimize LLVM for CUDA. When we started this effort, LLVM was well-tuned for CPUs but there had been little public work on improving its GPU performance. We developed, tuned, and augmented several general and CUDA-specific optimization passes. As a result, our LLVM-based compiler generates better code than nvcc on key end-to-end internal benchmarks and is on par with nvcc on a variety of open-source benchmarks.



Speakers

Jingyue Wu

Software Engineer, Google Inc.


Friday October 30, 2015 11:15am - 12:00pm PDT
Salon III & Salon IV

12:00pm PDT

Polly - Loop Nest Optimizations in LLVM

Since the last US developers meeting the Polly loop optimization infrastructure has evolved significantly, with substantial progress in terms of usability, applicability and compile time. Several regular developers joined our team from different places of the world and several of these started to extend and use Polly in new situations. In the last years both informal Polly meetups as well as officially announced BoFs have taken place at the US developers meeting as well as EuroLLVM and these have been of great help to coordinate our efforts. As several core developers are attending the US developers meeting this year, we will again have an official BoF to discuss with the wider LLVM community the impact of the recent changes we made in Polly, Polly's development agenda for the next years as well as its use in the wider LLVM community. Topics of interest might be: integration of Polly into the pass pipeline, interaction with other optimizations (vectorizer, classical loop optimizations), possible uses of Polly analyses, recent compile-time reduction, and new optimizations.


Speakers

Tobias Grosser

ETH Zurich

Sebastian Pop

Samsung Austin R&D Center
Loop optimizations, testing, benchmarks, performance tracking.


Friday October 30, 2015 12:00pm - 12:45pm PDT
Salon V & Salon VI

12:00pm PDT

An update on Clang-based C++ Tooling

This talk is going to give an update of the C++ tooling we are building on top of clang. Among others, it will focus on clang-tidy, a tool to statically analyze source code to diagnose and fix typical programming errors like style violations, interface misuse, or bugs. We'll give an update on the direction this project is taking, new checks that are being integrated and challenges we are facing. 

In a live demo, we'll show how we can fix specific problems throughout LLVM's own codebase. We'll also show how a new check can be added in a matter of minutes and how other Clang-based tools can help with its development.



Speakers

Manuel Klimek

Software Engineer, Google
Manuel Klimek has been a software engineer at Google since 2008 and a professional code monkey since 2003. After developing embedded Linux terminals for the payment industry and distributed storage technology at Google in C++, he decided that C++ productivity lags significantly behind other...


Friday October 30, 2015 12:00pm - 12:45pm PDT
Salon I & Salon II

12:00pm PDT

OpenMP GPU/Accelerator support Coming of Age in Clang

GPU/Accelerator computing will be the basis for the future of Exascale computing through the DOE's CORAL project. It is also the basis for future features for C++ Std's SG14's Games Development/Low Latency/Real Time/Graphics Study Group. However, LLVM currently lacks a unified platform-neutral infrastructure for offloading to GPUs/Accelerators, which severely limits clang/LLVM usage in these hugely important application domains.

 

For the past several years, a number of contributors from AMD, Argonne National Lab., IBM, Intel, Texas Instruments, University of Houston and many others have come together to deliver OpenMP support to clang. OpenMP 3.1 is now officially in clang 3.7, and work continues toward completing OpenMP 4 support, aiming for clang 3.8. One of the most important features of the OpenMP 4 standard is vendor- and platform-neutral support for accelerators.

 

 

The main presenters will be the OpenMP CEO and Chair of ISO C++'s SG5/SG14 along with the main developer who has been delivering OpenMP implementation in clang (with the help of many others). This talk will describe how GPU and Accelerators will be supported in clang. It offers an overview of the OpenMP 4 syntax, and a description of the upstreaming progress in both clang and llvm through this continued collaboration, as well as the offloading interface design that will describe target independent support across many hardware targets including Nvidia, Xeon Phi, ARM, and AMD devices.


Speakers

Michael Wong

Distinguished Engineer, VP, Codeplay
Michael Wong is Distinguished Engineer/VP of R&D at Codeplay Software. He is a current Director and VP of ISOCPP, and a senior member of the C++ Standards Committee with more than 15 years of experience. He chairs the WG21 SG5 Transactional Memory and SG14 Games Development/Low Latency/Financials...


Friday October 30, 2015 12:00pm - 12:45pm PDT
Salon III & Salon IV

12:45pm PDT

Lunch
Lunch! There will be tables set in the lobby and more in adjacent rooms.

Friday October 30, 2015 12:45pm - 2:00pm PDT
Salon Lobby

2:00pm PDT

GPU Implementers

LLVM is rapidly gaining popularity as a compilation framework for graphics processors. We will host a Birds of a Feather session to discuss issues of interest to implementers of GPU targets in LLVM. Topics may include:

- Techniques for overcoming common challenges in adapting LLVM to GPU targets

- Future directions in LLVM to benefit GPU targets

- Opportunities for different GPU targets to share infrastructure and/or optimizations



Speakers

Owen Anderson

Manager, LLVM GPU Team, Apple


Friday October 30, 2015 2:00pm - 2:45pm PDT
Salon V & Salon VI

2:00pm PDT

LLVM for a managed language: what we've learned

For a little over a year we have been working towards a production-quality, state-of-the-art LLVM-based JIT compiler for Java. This talk focuses on what we've learned about LLVM's strengths and weaknesses as an optimization framework for Java-like languages. We will discuss interesting challenges in efficiently implementing Java's semantics within LLVM IR, and how we've been growing LLVM towards being a more effective compiler for managed languages.



Speakers

Sanjoy Das

Software Engineer, Azul Systems Inc

Philip Reames

Azul Systems Inc


Friday October 30, 2015 2:00pm - 2:45pm PDT
Salon III & Salon IV

2:00pm PDT

Throttling Automatic Vectorization: When Less Is More

SIMD vectors are widely adopted in modern general purpose processors as they can boost performance and energy efficiency for certain applications. 

Compiler-based automatic vectorization is one approach for generating code that makes efficient use of the SIMD units, and has the benefit of avoiding hand development and platform-specific optimizations. 

The Superword-Level Parallelism (SLP) vectorization algorithm is the most well-known implementation of automatic vectorization when starting from straight-line scalar code, and is implemented in several major compilers. 

 

The existing SLP algorithm greedily packs scalar instructions into vectors starting from stores and traversing the data dependence graph upwards until it reaches loads or non-vectorizable instructions. 

Choosing whether to vectorize is a one-off decision for the whole graph that has been generated. 

This, however, is suboptimal because the graph may contain code that is harmful to vectorization due to the need to move data from scalar registers into vectors. 

The decision does not consider the potential benefits of throttling the graph by removing this harmful code. 

In this work we propose a solution to overcome this limitation by introducing Throttled SLP (TSLP), a novel vectorization algorithm that finds the optimal graph to vectorize, forcing vectorization to stop earlier whenever this is beneficial. 

Our experiments show that TSLP improves performance across a number of kernels extracted from widely-used benchmark suites, decreasing execution time compared to SLP by 9% on average and up to 14% in the best case. 
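
As a rough illustration of why throttling can help, the toy cost model below treats the SLP graph as a single chain running from the seed stores up to the loads. The `Node` fields, the cost numbers, and the function names are all hypothetical and are not taken from the TSLP work; a real vectorizer would use target cost tables and a general graph rather than a chain.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cost record for one node of a linear SLP chain.
// Index 0 is the group of seed stores; higher indices are closer to the loads.
struct Node {
    int saving;      // scalar cost minus vector cost (negative: vectorizing loses)
    int gatherCost;  // cost of packing scalar operands into a vector
                     // when the cut is placed just above this node
};

// Net saving when nodes [0, k) are vectorized and everything above stays scalar.
int savingForCut(const std::vector<Node>& chain, std::size_t k) {
    int total = 0;
    for (std::size_t i = 0; i < k; ++i)
        total += chain[i].saving;
    // A cut in the middle of the chain must gather scalars into a vector.
    if (k > 0 && k < chain.size())
        total -= chain[k - 1].gatherCost;
    return total;
}

// All-or-nothing decision: vectorize the whole chain or none of it.
std::size_t allOrNothingCut(const std::vector<Node>& chain) {
    return savingForCut(chain, chain.size()) > 0 ? chain.size() : 0;
}

// Throttled decision: try every cut point and keep the most profitable one.
std::size_t bestCut(const std::vector<Node>& chain) {
    std::size_t best = 0;  // 0 means "leave the code scalar"
    for (std::size_t k = 1; k <= chain.size(); ++k)
        if (savingForCut(chain, k) > savingForCut(chain, best))
            best = k;
    return best;
}
```

With per-node savings of 2, 2, -7 and 1 and a gather cost of 1 everywhere, the all-or-nothing rule rejects the chain (net saving -2), while cutting after the second node keeps a net saving of 3.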


Speakers

Vasileios Porpodas

University of Cambridge


Friday October 30, 2015 2:00pm - 2:45pm PDT
Salon I & Salon II

2:45pm PDT

Exception Handling in LLVM: the Windows/CLR and Itanium models

Come discuss the new exception handling constructs in LLVM, how they enable targeting the Windows and CLR personality routines, how they differ from and relate to Itanium-style landing pads, the impact that SSA/IR constraints had on the design, and future directions. This session is for anyone interested in the IR model and/or exception handling semantics and codegen.


Speakers

Andy Kaylor

Sr. Software Engineer, Intel
I've been a tools developer at Intel for 17 years and have been working with LLVM since 2012, contributing to areas such as MCJIT, LLDB and Windows exception handling. I'm about to dive into LLVM's representation and handling of floating point operations.

Reid Kleckner

Software Engineer, Google
I work on Clang, the C++ compiler. I specifically work on C++ ABI compatibility with MSVC, and other Windows-related issues in Clang.

Joseph Tremoulet

Principal Software Engineer, Microsoft


Friday October 30, 2015 2:45pm - 3:30pm PDT
Salon V & Salon VI

2:45pm PDT

LLVM back end for HHVM/PHP

The HipHop Virtual Machine (HHVM) is a JIT compiler for executing PHP programs. It is used by some of the world’s largest websites such as facebook.com and wikipedia.org, among many others. At Facebook we have frequently been asked why we don't use LLVM as a back end for HHVM. Inspired by the success of Apple’s FTL we implemented an alternative back end using LLVM. 

In this talk we will share what it took to hook LLVM into HHVM, from conception to running limited production traffic. We will cover changes to our internal IR and modifications we had to make to LLVM. We will discuss performance challenges we faced and peculiar bugs, and finally explain why we are not yet at the point of enabling the LLVM back end for production Facebook traffic.



Speakers

Brett Simmers

Software Engineer, Facebook, Inc.


Friday October 30, 2015 2:45pm - 3:30pm PDT
Salon III & Salon IV

2:45pm PDT

LoopVersioning LICM

Loop invariant code motion (LICM) is an important compiler optimization that moves invariant instructions out of a loop without affecting the semantics of a program. 

For safety, it checks alias dependencies before moving an invariant instruction out of the loop. 

In some cases memory aliasing may make this optimization ineffective. This results in possible missed opportunities in speeding up applications. 

 

LoopVersioning LICM is a step to exploit those missed opportunities where memory aliasing may make LICM optimization ineffective.
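
The idea can be sketched by hand in source form. The example below is only an illustration under assumptions: the function names and the `inRange` helper are hypothetical, and the actual pass works on LLVM IR with its own runtime checks rather than on C++ source. The load of `*scale` is loop-invariant only when `scale` does not point into `out`, so the versioned function tests for that at run time and picks either a loop with the load hoisted or the original loop.

```cpp
#include <cstddef>
#include <functional>

// Original loop: *scale must be reloaded every iteration because a store
// to out[i] might overwrite it.
void scaleOriginal(int* out, const int* in, const int* scale, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i] * *scale;
}

// True if p points into [begin, begin + n). std::less gives a well-defined
// ordering even for pointers into different objects.
bool inRange(const int* p, const int* begin, std::size_t n) {
    std::less<const int*> lt;
    return !lt(p, begin) && lt(p, begin + n);
}

// Versioned loop: a runtime alias check selects between two copies.
void scaleVersioned(int* out, const int* in, const int* scale, std::size_t n) {
    if (!inRange(scale, out, n)) {
        // No aliasing: the invariant load is hoisted out of the loop.
        const int s = *scale;
        for (std::size_t i = 0; i < n; ++i)
            out[i] = in[i] * s;
    } else {
        // Possible aliasing: fall back to the unmodified loop.
        scaleOriginal(out, in, scale, n);
    }
}
```

Both functions produce the same results for every input; the versioned one simply avoids the repeated load on the common non-aliasing path.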


Speakers

Ashutosh Nema

Compiler Engineer, AMD


Friday October 30, 2015 2:45pm - 3:30pm PDT
Salon I & Salon II

3:30pm PDT

Poster Session
Poster session and PM snack break. The following posters will be on display:

Run Android with LLVM - Shuo Kang, SkyEye project of Tsinghua University

We have employed the JIT capability of LLVM as the dynamic translation engine of our full-system simulator. The project translates ARM instructions to LLVM IR and then uses the LLVM JIT to translate and run that IR. By applying various LLVM optimization passes and some smart policies, we have achieved a large improvement in instruction execution performance. A complete Android system can now run on this LLVM-backed full-system simulator. You can even play the Angry Birds application smoothly on the simulator. Compared to the official Android emulator, which is built on QEMU, we found some performance improvement for most Android applications.

OpenMP Support in Clang: to 4.0 and Beyond! - Alexey Bataev, Intel, Andrey Bokhanko, Intel, Sergey Ostanevich, Intel

OpenMP is a well-known and widely used application programming interface for shared-memory parallelism. The project to implement OpenMP support is carried out by many people from AMD, Argonne, IBM, Intel, Texas Instruments, University of Houston and other organizations – including several members of the OpenMP Architecture Review Board.

Full implementation of OpenMP 3.1 support was released with clang 3.7. It proved to be a popular choice among C++ programmers looking for a compiler combining all the clang virtues with OpenMP capabilities. 

OpenMP continues to evolve. Version 4.0 of the standard was published a couple of years ago and introduced a host of improvements, most notably support for computation offloading. We will elaborate on the current progress of implementing these new features in clang and highlight the design of offloading support in the LLVM compiler, recently proposed to the community by us and our colleagues.

The upcoming OpenMP 4.1 adds even more interesting features, like extended offloading, taskloops, new worksharing/simd clauses, etc. We will explain our plans regarding support for all these new features as well.

Evaluation of Core Tuning Options (-mcpu) in LLVM - Minseong Kim, Samsung Electronics, Hyeyeon Chung, Samsung Electronics, Taekhyun Kim, Samsung Electronics

Modern compilers like LLVM and GCC provide core tuning options that enable the generation of highly tuned code for the underlying H/W. This poster presents our observations on the performance of various tuning options, and then presents guidelines for the proper use of core tuning options. The poster also discusses our findings on room for improvement in LLVM tuning options.

Sampling for data races - Peter Goodman, Trail of Bits, Angela Demke Brown, University of Toronto, Ashvin Goel, University of Toronto

Race Sanitizer (RSan) is a new data race detector that implements a variant of the DataCollider algorithm. RSan avoids the complexity and overhead of tracking memory access interleavings by instrumenting a subset of loads/stores. RSan introduces scheduling delays at instrumented loads/stores to detect data races. RSan avoids the issue of determining what memory will suffer from racy accesses by leveraging type-awareness to uniformly sample all memory locations. RSan has been implemented as an LLVM module pass and an efficient runtime system.

Code Clone Detection in Clang Static Analyzer - Kirill A. Bobyrev, MIPT, Vassil Vassilev, CERN

Copy-paste is a common programming practice. Most programmers start from a code snippet that already exists in the system and modify it to match their needs. Some code snippets easily end up being copied dozens of times. This manual process is error-prone and leads to the seamless introduction of new hard-to-find bugs. Copy-paste also usually means worse maintainability, understandability and logical design. Clang and Clang’s static analyzer provide all the building blocks to build a generic C/C++ copy-paste detecting infrastructure.
Large codebases may contain from 5% to 20% of identical code pieces, which leads to all the mentioned problems. My GSoC project introduces Code Clone Detection to Clang Static Analyzer and allows processing large projects in order to find duplicates.

Automatically finding and patching bugs in binary software using LLVM - Ryan Stortz and Jay Little of Trail of Bits

As part of DARPA’s Cyber Grand Challenge, we utilized McSema to translate executables into LLVM IR. We then built and extended tools on the LLVM toolchain that allowed us to automatically discover and patch exploitable software vulnerabilities. This poster presents our system, its capabilities and limitations, and our CGC results.

Molly - Parallelizing for Distributed Memory using LLVM - Michael Kruse

Motivated by Lattice Quantum Chromodynamics applications, Molly is an LLVM compiler extension, complementary to Polly, which optimizes the distribution of data and work between the nodes of a cluster machine such as Blue Gene/Q. Molly represents arrays using integer polyhedra and builds on Polly, an existing compiler extension that represents statements and loops using polyhedra. When Molly knows how data is distributed among the nodes and where statements are executed, it adds code that manages the data flow between the nodes. Molly can also permute the order of data in memory. 

Gigabyte/Second Unicode RegEx Matching with Parabix/LLVM - Nigel W. Medforth and Robert D. Cameron

Building on the Parabix transform representation of text and LLVM MCJIT, icGrep offers dramatically accelerated regex search compared to byte-at-a-time alternatives, as well as superior Unicode support.
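
To give a feel for what a parallel bit stream representation looks like, the sketch below transposes a short text into eight bit streams and then finds every occurrence of one byte using only bitwise operations on whole 64-bit words. This is a simplified illustration, not icGrep code: the function names, the 64-byte block limit, and the bit numbering (bit `k` of the byte goes to stream `k`, least significant first) are assumptions made for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Transpose up to 64 bytes into 8 bit streams:
// bit i of streams[k] is bit k of text[i].
std::array<std::uint64_t, 8> transpose(const std::string& text) {
    std::array<std::uint64_t, 8> streams{};
    for (std::size_t i = 0; i < text.size() && i < 64; ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        for (int k = 0; k < 8; ++k)
            if (c & (1u << k))
                streams[k] |= std::uint64_t{1} << i;
    }
    return streams;
}

// Returns a mask with bit i set where text[i] == byte. Each stream is
// combined with one AND, so up to 64 positions are tested at once.
std::uint64_t matchByte(const std::array<std::uint64_t, 8>& streams,
                        unsigned char byte, std::size_t length) {
    std::uint64_t match =
        length >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << length) - 1;
    for (int k = 0; k < 8; ++k)
        match &= (byte & (1u << k)) ? streams[k] : ~streams[k];
    return match;
}
```

For the text "banana", matching the byte 'a' sets bits 1, 3 and 5 of the result. A byte-at-a-time matcher would need one comparison per character to produce the same information.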



Friday October 30, 2015 3:30pm - 4:30pm PDT
Salon Lobby

4:30pm PDT

Managed languages targeting LLVM

This BoF will focus on the needs of managed languages (such as Java, C#, JavaScript, and Python) and how to compile various high level features efficiently using the LLVM toolchain. Topics likely to be covered include: deoptimization and speculative compilation, frame introspection, range check/null check elimination optimizations, implicit null pointer exceptions, lazy materialization of IR during optimization, exception handling integration, and patchable code. This BoF will not cover garbage collection since there's another BoF planned for that topic.



Speakers

Sanjoy Das

Software Engineer, Azul Systems Inc

Philip Reames

Azul Systems Inc

Joseph Tremoulet

Principal Software Engineer, Microsoft


Friday October 30, 2015 4:30pm - 5:15pm PDT
Salon V & Salon VI

4:30pm PDT

Compiling large, real-world codebases with clang on Windows

LLVM 3.7 is the first release that can build large projects such as Chromium on Windows without having to fall back to Visual Studio's compiler for a single translation unit. This talk gives an overview of the work done to get to this state: it covers language extensions clang needed to learn to parse Microsoft's headers and dark corners of the Microsoft ABI, with a focus on work done in the last year. Much of the Windows support was developed in tight collaboration between the Chromium and LLVM projects. The talk also touches on how this collaboration works and why it’s successful. Finally, the talk gives an overview of how to get projects that build with Visual Studio building with clang.



Speakers

Friday October 30, 2015 4:30pm - 5:15pm PDT
Salon III & Salon IV

4:30pm PDT

Debug Info: From Metadata to Modules

The efficiency of debug info in LLVM and Clang improved dramatically this year.  This talk is about what it took to get here and what work remains.

 

We'll talk about how Metadata was redesigned to make the debug info IR memory-efficient (with a human-readable assembly syntax).  We'll go into the implications for other Metadata graphs, and what a more expressive Metadata future could look like.  We'll also include an overview of what's left to scale debug info for LTO.

 

We'll also talk about Clang's new module debugging feature, which reduces the size of debug info on disk, improves compile time, and makes full type information available to debuggers.  We'll highlight how Clang-based debuggers like LLDB can use module debug information to enhance expression evaluation.


Speakers

Adrian Prantl

Apple
Ask me about debug information in LLVM, Clang and Swift!


Friday October 30, 2015 4:30pm - 5:15pm PDT
Salon I & Salon II

5:15pm PDT

Lightning Talks
This session consists of a series of talks, each 5 minutes long. Here is the lightning talk schedule:

The recent switch lowering improvements - Hans Wennborg, Google

Earlier this year, the DAG switch lowering was rewritten to improve the performance of code generated for switches. The new algorithm always generates balanced trees, is better at finding jump tables, and can exploit profile information. This lightning talk would give a walk-through of the new switch lowering.
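
The two strategies mentioned above can be illustrated with a small source-level sketch. It is not the LLVM implementation: the `Case` struct, the function names, and the density threshold parameter are hypothetical. `selectCase` mimics the comparisons a balanced tree of case values performs, and `isDenseEnough` shows the kind of density test that could decide whether a jump table is worthwhile.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one switch case: a value and the id of the
// block to branch to.
struct Case {
    int value;
    int target;
};

// Walks a balanced comparison tree over cases sorted by value.
// Each step halves the remaining range, so lookup takes O(log n) compares.
int selectCase(const std::vector<Case>& sorted, int x, int defaultTarget) {
    std::size_t lo = 0, hi = sorted.size();
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (x == sorted[mid].value)
            return sorted[mid].target;
        if (x < sorted[mid].value)
            hi = mid;
        else
            lo = mid + 1;
    }
    return defaultTarget;
}

// A jump table covers every value between the smallest and largest case,
// so it only pays off when enough of that range is actually used.
bool isDenseEnough(const std::vector<Case>& sorted, int minDensityPercent) {
    if (sorted.empty())
        return false;
    const long long range =
        static_cast<long long>(sorted.back().value) - sorted.front().value + 1;
    return static_cast<long long>(sorted.size()) * 100 >=
           range * minDensityPercent;
}
```

For cases 1, 2, 3 and 100, the range spans 100 values but only 4 are used, so a 40 percent threshold rejects a single jump table and the tree search is the better fit. Cases 1 through 4 pass the same test easily.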

ds2, a tiny debug server used with lldb - Stephane Sezer, Facebook

This talk will present ds2, a debug server that we use in conjunction with lldb to do remote debugging at Facebook. It currently supports remote debugging on Linux/Android/Tizen, and Windows as well as FreeBSD support is under development. 

This debug server's small size and the fact that it depends only on libc++ make it an ideal candidate to include in embedded platforms where space is limited. Source is available here: https://github.com/facebook/ds2

Accelerating Stateflow with LLVM - Dale Martin, MathWorks, Inc., Ramkumar Ramachandra, MathWorks, Inc.

Learn how MathWorks has improved the customer experience for Stateflow users using LLVM for JIT-based simulation. We will discuss how we translate from our high-level IR to LLVM's low-level IR and use it for fast-starting, high-performance simulation. We will also discuss a number of challenges and shortcomings we faced with the LLVM infrastructure.


Putting Debug Info on a Diet - David Blaikie, Google Inc

Debug info size is... sizable. Two years ago, Clang's debug info was up to twice as large as GCC's; after 9 months, it was nearly half the size. Where and how did we cut the fat?


Large scale libc++ deployment - Evgenii Stepanov & Ivan Krasin, Google

This talk presents the authors' experience switching a large, quickly changing codebase from libstdc++ to libc++. We list common problems, solutions and ideas for future libc++ improvements.


To LLVM Bytecode Obfuscation and Beyond - Serge Guelton, Quarkslab, Adrien Guinet, Quarkslab

An introduction to LLVM pass building when working out-of-tree, through 3 simple obfuscating passes. Featuring building and using a custom analysis, tracing your passes, and testing them with lit!


An Implementation of Swing Modulo Scheduling in a Production Compiler - Brendon Cahoon, Qualcomm

In this talk, we present the implementation and evaluation of a machine level software pipelining optimization pass based on Swing Modulo Scheduling. Our software pipelining implementation improves performance by 20% on a set of image processing kernels.


Integer Vector Optimizations and "Usual Arithmetic Conversions" - Stephen Rogers, Movidius


Speakers

David Blaikie

Software Engineer, Google Inc.

Serge Guelton

QuarksLab

Dale Martin

MathWorks

Stephane Sezer

Software Engineer, Hudson River Trading


Friday October 30, 2015 5:15pm - 6:00pm PDT
Salon I & Salon II

5:15pm PDT

Advances in Loop Analysis Frameworks and Optimizations

The talk will survey recent advances in loop analysis frameworks to support optimizations like unrolling, distribution, loop-aware load-elimination and multi-versioning. A significant part of our contribution was to rethink and re-design existing analysis frameworks to make them both more powerful and more widely accessible. The major part of this talk will focus on introducing these analysis frameworks and how they are used by optimizations. We will also discuss how they integrate with other analysis passes and outline ideas for their future evolution.



Speakers

Adam Nemet

Apple Inc.


Friday October 30, 2015 5:15pm - 6:00pm PDT
Salon III & Salon IV
 