Pitfalls of ML at (not only) Kiwi.com – Roman Rožnik
No doubt machine learning has been a hot topic in recent years. It seems everybody can easily become a data scientist and do ML within a few lines of code. Reality is much harder. Understanding the problem, preparing the right training data, cleaning it, designing features, balancing the interpretability and complexity of the model, defining the right metrics, looking at false positives and negatives, interpreting ML results or A/B tests: these topics are closely tied to data science, yet they are often overlooked and underrated. I’d like to emphasize that they are very important and that ML itself is just one small piece of the complex data science puzzle. Bringing data science and ML approaches to a crazy company like Kiwi.com is very hard and often frustrating. It costs a lot of blood, toil, tears and sweat, and it brings disillusion, sadness and a lot of failures.
Roman has always been the developer most interested in math and algorithms. With a stroke of luck, he became the machine learning guy at Seznam.cz where he introduced ML to the full-text search team. After he fulfilled his mission with Seznam.cz, he decided to bring the ML and data science approach to Kiwi.com.
Deploying text RNNs in a bank – Vlado Boža
One part of the anti-money laundering process is a background check of the client to determine whether the client has been engaged in any risky or illegal activity. This was done mostly by manually searching news databases and reading the returned articles. Vlado will present how Merlon Intelligence helps automate this process by implementing a solution using recurrent neural networks and also how it has coped with limited availability of training data.
Vlado Boža is a Lead ML engineer at CEAI where he focuses on hard problems requiring interesting solutions, especially in the area of NLP. He is a co-founder of Black Swan Rational, a boutique data science consultancy where he worked on projects like computer vision for detecting clouds in the sky, analysis of electrical smart-meter data and analysis of M&A data. Vlado holds a PhD in Bioinformatics focusing on processing DNA sequencing data. He is a big fan of clever heuristics, probabilistic algorithms and any interesting efficient algorithms.
Building Safe AI – Andrew Trask
We are experimenting with a new format of moderated, live-streamed sessions with top experts from the field, so don’t hesitate to come with friends to Impact Hub to support us, enjoy the talk and rock the afterparty!
In the first half of this talk, I’ll introduce and describe Private Deep Learning, which is an approach to training neural networks in an encrypted state such that its growing intelligence (and the underlying data) is protected from theft. This will include a description of Federated Learning and Multi-Party Computation.
In the second half of this talk, I’ll be discussing the significant impacts this technology has when combined with the recent advancements in Blockchain and Peer-to-Peer into a new open-source platform called OpenMined. This will include a live demo showing how to train a neural network on a large, distributed, private dataset.
Andrew Trask is a PhD Student at the University of Oxford studying Deep Learning. He is also the author of Grokking Deep Learning, a Manning Publications introductory book which has sold over 6000 copies, an instructor in Udacity’s Deep Learning Nanodegree, and the author of a popular machine learning blog http://iamtrask.github.io. Previously, Andrew was a researcher and analytics product manager at Digital Reasoning where he trained the world’s largest artificial neural network with over 160 billion parameters and helped guide the analytics roadmap for the Synthesys AI platform deployed to many Enterprises such as Goldman Sachs, UBS, HCA (the largest hospital network in North America), various members of the Intelligence Community, and the US Military. Andrew lives on a boat in Oxford with his wife Amber and plays the piano in his spare time.
Why we need GANs for image manipulation – Michal Hradiš
Image processing certainly did not miss out on the big convolutional network revolution. Networks are at the core of state-of-the-art methods in image deblurring, superresolution, motion estimation, and even in such mundane tasks as image compression. Compared to more traditional approaches, networks can be trained for specific types of images, don’t require deep understanding of complex mathematics, and they can even hallucinate realistic image details.
In this talk, I will show you how efficient image processing networks can be built and trained. I will explain why the hell we need Generative Adversarial Networks and how they relate to human perception. The presented ideas will be demonstrated on real-world image enhancement applications. You’ll get a chance to experiment with them at home using the provided TensorFlow code.
Towards General AI – Tomas Mikolov
I will present CommAI, a project aiming to build the first general AI with human-level communication skills. This includes a novel type of training setup which stresses the importance of unsupervised and incremental learning. I will describe the simplest learning problems defined in this environment, and various attempts to solve these problems using classic machine learning techniques.
Tomas Mikolov has been a research scientist at Facebook AI Research since May 2014. Previously he was a member of the Google Brain team, where he developed and implemented efficient algorithms for computing distributed representations of words (word2vec project). Tomas obtained his PhD from Brno University of Technology (Czech Republic) for his work on recurrent neural network based language models (RNNLM). His long-term research goal is to develop intelligent machines capable of learning and communicating with people using natural language.
AI in the Office – Pavel Dvořák & Petr Mejzlík
How much time do you spend searching for information during your workday? Have you ever been searching for some document, presentation or diagram with information necessary for creating your report, writing documentation or designing a user interface according to your company’s standards? Have you always found it immediately or have you had to go through several folders to finally shout “Eureka”? Or have you even sometimes forgotten that a particular document exists and discovered it after your work was finished? Konica Minolta Laboratory Europe’s vision is to develop an operating system for the workplace of the future, called Cognitive Hub, as a nexus for users’ information flows. With Cognitive Hub we aim at turning busy workers into effective workers by letting them focus on creative work instead of doing tedious and boring tasks such as searching for information or people with the right skill sets. In this talk, we will introduce our overall goal and the role of Machine Learning in it, with a special focus on the Computer Vision area, and present what we have already done.
Pavel Dvořák works as a Computer Vision Research Specialist for Konica Minolta Laboratory Europe. In 2015, he obtained a PhD degree in Computer Science at Brno University of Technology with a thesis focused on the application of Computer Vision in Medical Imaging. During his doctoral studies, he also worked as a researcher at the Czech Academy of Sciences and spent altogether two years as a visiting researcher at several European universities, e.g. TU Munich, Medical University of Vienna or Vienna University of Technology.
Petr Mejzlík works as a Machine Learning Research Specialist for Konica Minolta Laboratory Europe. He obtained an MSc in Computer Science in 1987 and a PhD (Dr.) in Molecular Biology and Genetics in 1994 from Masaryk University in Brno. He taught and did computational chemistry/biology research at the Faculty of Informatics, Masaryk University until 2002. Since then, he has participated in technology research and technical software development at Virtual Reality Simulators, ANF Data, FEI, Honeywell, and Kinalisoft.
0-day Malware Detection at Scale – Zdenek Letko
In this talk, we will introduce the key aspects of the global infrastructure built by Wandera to offer organizations a global solution for Enterprise Mobile Security and Data Management. Next we discuss the exciting journey towards Machine Learning (ML) based zero-day malware detection in a production environment. We will talk about the usual ML steps such as data harvesting, feature extraction, classification algorithm optimisation, model training, and evaluation. Having a functioning ML model is great, but how do you use it in production? The remainder of this talk is devoted to answering this question and will focus on model retraining, deployment, monitoring, and solution maintenance. And since we are not superheroes, the talk will be interlaced with lessons learnt – usually discovered the hard way. 😉
Zdenek is a software engineer and data science/machine learning enthusiast. He is currently working for Wandera, helping MI:RIAM to see, understand, and predict Internet traffic and application behaviour.
FlowerChecker: Exciting journey of one ML startup – Ondra Veselý & Jiří Řihák
FlowerChecker, a machine learning startup, was established three years ago by three PhD students with one goal: plant identification.
The story-like talk shows how we used machine learning to validate the initial business idea, how we struggled trying to use existing image-recognition software, and how and why we collected a dataset for the first commercial machine learning system with different interfaces: a mobile app, a Facebook chatbot and a Twitter guerrilla marketing bot. Many colorful graphs included.
The second part of the talk gets more technical: TensorFlow, Inception v3, data preprocessing tricks, performance tuning and debugging. Basically all the struggles we needed to overcome to be able to identify 9000 different plant species.
Ondřej is a developer and data engineer. After a brief experience with development for Seznam.cz and AVG Technologies, he founded FlowerChecker, where he serves as CEO. After stabilising the business, he joined Kiwi.com in its early startup stage to establish analytics and research teams. Currently he builds streaming pipelines for business intelligence, leads Czechitas Python courses and consults on R&D projects for the European Commission.
Jirka is a FlowerChecker co-founder responsible for app development and ML in plant identification. He is also finishing his PhD in the Adaptive Learning group, a small but enthusiastic research lab at FI MU focused on applications of ML in education.
Image Search @ Seznam.cz – Lukáš Vrábel
Seznam.cz is, among other things, a major player on the search engine market in the Czech Republic. A few years ago, we switched our image search service from a third-party provider to our own in-house solution. It started as a simple modification of our fulltext search engine. But, over time, it has gradually developed into a standalone system.
This talk will be focused on the evolution of the search, the obstacles we faced and the solutions we implemented. We’ll briefly discuss the models, machine learning techniques and features that are used in the image search pipeline. The focus will mostly be on our investigation into deep learning in order to further improve the relevancy of the system.
Lukáš has industry experience with various machine learning tasks ranging from NLP through web page analysis to image recognition. Formerly head of the research department at Seznam.cz, he is currently solving the world’s largest industry problems using AI and machine learning at CEAI.
Yes, we can improve it with Evolutionary Computation! – Lukáš Sekanina
In this talk, I will survey the main principles and branches of evolutionary computation (EC) and show typical applications of EC. In particular, EC will be presented in connection with approximate computing, a recent approach introduced for developing faster, more energy-efficient, and less complex computer systems in which the correctness requirement can be relaxed to some extent.
Case studies will be focused on evolutionary approximation of digital circuits that are crucial for low-power image processing, deep neural networks and other systems on a chip.
Lukáš has (co)authored over 150 papers, mainly on genetic programming, approximate computing, applications of bio-inspired AI and digital circuit design. His research results were awarded one Gold and two Silver medals at the Humies competition, organized annually at GECCO.
Face detection and verification – Marián Beszédeš
Time Series Predictions using Neural Networks – Rudradeb Mitra
One of the key applications of time series data in AI is Predictive Analytics. Companies are using predictive analytics in various ways – from predicting customer buying behavior to predicting health risk or predicting the future breakdown of trucks.
In this talk, the speaker will explain the concepts of time series databases and predictive analytics. Then he will show through examples how neural networks (NN) are used to make time series predictions.
After finishing his Master's at the University of Cambridge, Rudradeb went on to build 4 startups: two in Silicon Valley, one in the UK and one in Belgium. These days his focus is on applications of AI and IoT. In his free time, he writes and talks about Artificial Intelligence, IoT and startups.
Speech data mining: not yet ready for retirement – Honza Černocký
How is Machine Learning used in Fintech? Vertical AI and its Applications in the Financial Services Industry – Bradford Cross
We will start by mapping the financial services sector into banking, insurance, investments, real estate and consumer financial services and looking at machine learning applications in each.
Then we will contrast modern machine learning approaches against traditional ‘quant finance.’
Finally, we will dive into specific applications like underwriting credit and insurance, risk scoring, product and marketing, financial crimes, real estate, and some more exotic ideas like using satellite imagery to make economic predictions on earth.