Helen’s Notes and Links from EMiT 2019
All talk slides can be found here
Intro:
Prof. Liz Towns-Andrews: touched upon industrial strategy - AI funding increasing via gov investment. NB there’s a Yorkshire industrial strat group. Shout out to 3Mbic and NPL collaborations at Huddersfield.
Keynote: data flow supercomputer big data
Prof. V. Milutinovic home.etf.rs/~vm/ISCUltimateDataFlow.ppt
inspiration sources all have famous books Feynman’s notes on computing Von Neumann paradigm talks about relation on time to calculate vs time to communicate. Now we computer much faster than can communicate - moving to novel architectures moving to multiscale approach using aSoGs via FPGAs with maxeler. Used in physics computational code - bad mouthing Fortran MPI - suggesting that these methods could be helpful with a willingness to move technique for existing code.
**See presentation for paper reading suggestions **
Like all these new hardware solutions, have their own set of coding languages here prefaced with max.
highlights Kolmogorov complexity
has geophysics applications:
Schlumberger seismic data acquisition: Nemeth et al. seg 2008 implementation of the acoustic wave equation of FPGA - allows lower energy consumption to do calculations on the vessel
slide 34 gives weather forecasting example from internship students - see later slides for more applications appgallers.maxeler.com. mi.sanu.ac.rs/~appgallery.maxeler
Possible to get access to the machine in London
potential increase in HPC quality of life impacts - supercomputers are huge and energy-hungry - control flow is very energy costly compared to the future possible data flow technology.
currently looks not sure open source
Stands and posters
National Physic Laboratory had a dashboard set up with sensors on a turbine
used node.redand suggested https://thingsboard.io/ as dashboard interfaces for easy drag and drop setups. Also suggested graphly services.
Novel hardware
Mike Ashworth exascale project - weather forecasting
FPGA offer large performance per W gains
LFRic - Dynamics from GungHo, Scalability and Accuracy focusing. Built on the concept of “separation of concerns” i.e. separating scientific code from parallel code - also speaks about PsyClone.
Helmholtz solver saw ~50%. Speed up.
requires either c code vivado hls or opmss bsc, or the max j or open cl etc.
focuses on the c in vivado hls to rewrite the UM in C on zynQ ultrascale blocks
currently not as fast but MUCH cheaper. Note Breton white paper on gpu vs FPGA performance/price.
beginning to move into LFRic code
Event based computation P. A. Bogdan
moving from deep nns to spiking nns both have hardware accelerators
current applications outside of neuroscience are difficult
qr code videos around. Showed event based camera sensors i.e rather than frames just picks out changes in luminosity.
Deep learning and GPUS
Porting Telemac mascaret to open power ** J. Grasset**
TELEMAC is a traditional GFD model Fortran based MPI code to work on CPUS
this guys is looking to use GPUS instead of CPUS
identified a huge bottleneck
move to an openACC with MPI to run on GPUS - run much quicker also use hybrid open mp MPI to see a smaller improvement. Also saw an increase using the pig compiler vs the IBM - on Nvidia hardware
interesting comments on smt speedups not being as good as would be expected - seen not just here
demos supercomputer for material sciences M. Turchetto
the demos system was focused on GPUS focused cheap alternative to for example Intel Xeons - although more constrained on applications - but significantly cheaper
now have a hybrid supercomputer with Tesla GPUS
machine learning parallelisation Dr Torsten Hoefler
Nui et al 2011 and [Dean et al 2012]( using stale synchronous - adjusting weights according to how old the data is so modern data is weighted higher than old data
also shows examples of losing data convergence of sparsified methods nips 18
deep 500 - 500 ways to optimise your Neural Network with a github codebase
https://spcl.inf.ethz.ch/Publications/.pdf/deep500_ipdps19.pdf
Panel discussion
key emerging technique to track?
https://www.oerc.ox.ac.uk/projects/ops ops with caution - coding with no care for the architecture can never be done but something like this could be used works with CUDA
see https://op-dsl.github.io/ porting applications across parallel architectures - panel seemed to not super like this idea but perhaps for not so emerging technologies could be handy
noted that archer 2 will be lacking even GPUs - large resistance in porting code.
ensuring requisite skills to get uptake?
spoke about UKRI kpt push funding for increasing knowledge base across education and industry
Novel Software
Lattice Boltzmann method L. Ragta
using on multiple architectures - wants to future proof the software
HiLeMMS (High-Level Mesoscale Modelling System) using mesh solvers in domain-specific language based on ops library to run on different architectures
Pattern mining - S. Titarenko
used to clustering or classification analysis
wants to work in real time so can have limits on storage and compute time - needed optimising
use farpam for optimised constraints on pattern-mining
packs binary info into smaller amounts on memory and used binary logic
give an example of patterns in temperature change in European cities dropped the pattern mining from few hours to minutes with uncertainty constraints few hours dropped to few seconds (~6000 times faster!)
optimised by sorting first and storing only unique events to reduce length, and then transfer to binary vectors
Allows to further reduce the number of patterns required to check.
highlights the importance of compression and efficient multithreading and simplifying the problem.
WORKSHOPS
Deep learning workshop
Introduction
slides available (talk similar to AI day workshop in dec)
SPANNER
bio inspired neural networks - spike driven
modelled on synapses
use hardware advancements to create “homeostasis”
focused on fault tolerance and self-repairing
Astrocyte neural networks
Use FPGA systems (field programmable gate array)
Neuromorphic engineering - analogous to biological processes
looks at electrical characteristics of neurone and synapses
Spiking neural networks - input currents give output spikes
Biological overview Wade 2011 bidirectional coupling between astrocytes… (plos one 6.12)
Kinda cool works with currents and transmitting probability
showed an example of a fault-tolerant self-driving robot car
special hardware even alters clock speeds
spiNNaker
Spiking neural network platform
simulate 1% human brain using a 100kW machine (low power)
Has no shared memory - works with spinnaker packets instead and uses multicast routing
Like GPU’s requires a special set of software to work - convert python using spinnaker c and requires extracting of spikes back to arrays/data
Has a front end for non-neuroscience applications too e.g PyNN, graph - requires writing spinnaker applications
a Million cores available can make embarrassingly parallel code e.g. Lattice Boltzmann Method for parallelising stokes theorems
SpiNNaker 2 => 10 times scale-up in most aspects - aiming to the model whole brain!
Working in the future to Map Tensor Flow to spiNNaker
NVIDIA DIGITS class
Start off training Alex net NN - with only 100 epochs does pretty well at identifying the dog
Uses DIGITS running on caffe
Use large Kaggle datasets for training datasets - have to be “rich” enough to prevent overfitting
preparing data: must standardise images (e.g. all same size etc) and separate into classes
can specify a set of data for validation e.g. 75% for training and 25% validations
if you see the loss not changing in the validation but changing in your training data set that implies that you are at the edge of overfitting
SPANNER WORKSHOP
FPGA
configurable array of logic array
requires hardware description languages like verilog (similar to C) or VHDL
example a 4 bit counter in verilog with input and output ports
mostly closed source but can use free version https://www.edaplayground.com/register
use toolset simulator synopsis vcs and open epwave after run
whole set of links to run through various tools can be as simple as clicking RUN
example of unsupervised fault-tolerant models - with spiking nns
building a neuron in verilog with leaky inter-gate and fire method
Can see in the “waves” spikes appear with current thresholds being reached and adjust to get stable firing rates
in this example the current reduces and generates a fault but the firing rate increases to make up for this
Uses a moving average to calculate spike rates see links 6 and 7 showing difference in spikes with different input currents
The uses a transmission probability - approximates a Gaussian distribution to a step function via bandpass filter
Follows a Hebbian Learning framework.
Has to account for differences in spikes from inputs to outputs - uses correlations to adjust and the neurones can even have different learning rates. Uses bcm stdp rules for representing activation functions.
Weight adjustments are made via a synaptic calculator - like the loss function weight adjustments
example link 9 is specifically forcing the synaptic waves to never stop learning
can input multiple different patterns of spikes for pattern recognition in the example a robot picks up a pattern of spikes, for example, an object is identified and the pattern tells the robot to move left
Lastly reduction in hardware resources can be achieved by logical optimisation and partitioning
They show a raspberry pi robot that can follow a maze Www.york.ac.uk/robot-lab/spanner
spiNNaker
https://spinn-20.cs.man.ac.uk
Jhub tutorial.
using pyNN.spiNNaker
well written ipython notebook tutorials in the Manchester Jhub.
set up populations of neutrons and specify how to connect them with weights and delays (due to the spiking nature) - delays of up to 16 timesteps is recommended
can use static or plastic (time dependent) synapses
runs in ms rather than timesteps pyNN sends the code to spiNNaker and runs it all for you. Then you have to extract the data in a weird neo format