LT-Accelerate is the premier European conference focusing on building value through language technology. It links text, speech, social and big data analysis technologies to a spectrum of corporate and public sector applications.
UNICOM is the Networking Partner of this conference, and OptiRisk is the Knowledge Partner, presenting “Exploiting market sentiment to create daily trading signals”.
LT-Accelerate is a venue for learning, networking and connecting, aimed at business analysts, technologists, consultancies and executives.
The key to a great conference is great presentations.
Topics to be covered:
Applied technology focus
Please book through UNICOM for a 5% discount using the code LTUNICOM.
Seth Grimes, Alta Plana Corporation
Maria Eskevich, Radboud University, Nijmegen, the Netherlands
FutureTDM is an action funded by the European Commission that aims to enable and encourage the uptake of text and data mining (TDM) in Europe. We involve stakeholders in discussions about their TDM experience, and analyse existing studies, policies, and economic and scientific trends that highlight the usefulness and potential of TDM across all economic sectors. We inform the community at large about our findings and create a knowledge-sharing platform for information and experiences, to support and inspire innovation growth and technology interoperability.
Mike Hyde, Skype
Matthew Honnibal, spaCy
Lev Konstantinovskiy, RaRe Consulting
I will tell the story of a real industrial project with noisy and missing data and a partially defined problem, warts and all. The project turned 70 formats of truck-routing emails into a structured table. This table became a real asset for the client and could be analysed by their data science department. Our single deep learning model was able to replace a previous complicated and fragile regular-expression solution. I will also describe how our open-source NLP package Gensim has provided business value to clients in other contexts, in particular how topic modelling can give a bird's-eye view of an entire corporate text collection.
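To make the topic-modelling idea concrete, here is a minimal Gensim sketch; the toy documents and the choice of two topics are invented for illustration and are not from the project described above.

from gensim import corpora, models

# Toy pre-tokenised documents standing in for a corporate text collection.
texts = [
    ["truck", "route", "delivery", "schedule"],
    ["invoice", "payment", "delivery", "customer"],
    ["route", "driver", "truck", "fuel"],
]

dictionary = corpora.Dictionary(texts)            # word <-> id mapping
corpus = [dictionary.doc2bow(t) for t in texts]   # bag-of-words vectors

# Fit a small LDA model and print the discovered topics.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
for topic_id, words in lda.print_topics():
    print(topic_id, words)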
Yves Peirsman, NLP Town
In recent years, Natural Language Processing seems to have become something of a commodity. Cloud APIs such as TextRazor, Aylien and AlchemyAPI offer off-the-shelf NLP solutions such as entity recognition and sentiment analysis in a variety of languages. NLP software packages such as spaCy (Python) and Stanford CoreNLP (Java) allow developers to integrate various NLP tasks directly into their own software, with or without modification. Libraries such as scikit-learn (for a wide variety of machine learning models) and TensorFlow (for neural networks) significantly lower the threshold for developers to enter machine learning and build their own NLP models. In this fragmented landscape, it can be hard to see the forest for the trees. In this presentation, I will explore the continuum between custom and off-the-shelf NLP. I will review the landscape of NLP APIs and libraries and evaluate whether these tools really deliver on their promises. Finally, I will give recommendations on how to pick and choose from the available options and how to build a high-performing NLP solution.
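As an illustration of how low the entry threshold has become, a minimal spaCy sketch follows; it assumes the small English model has been installed with "python -m spacy download en_core_web_sm", and the example sentence is invented.

import spacy

nlp = spacy.load("en_core_web_sm")   # off-the-shelf English pipeline
doc = nlp("Apple is opening a new office in Amsterdam next year.")

# Named entity recognition works out of the box, with no custom training.
for ent in doc.ents:
    print(ent.text, ent.label_)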
Pedro Dias Cardoso, Synthesio
Preriit Souda, TNS
Mara Tsoumari, SentiGeek
Discover the business value in customer reviews, built on the features, technical specifications and other product aspects discussed in each comment or review. SentiGeek will also highlight the new perspective that combining reviewer profiles with aspect-based brand and product insights offers for customer persona profiling. Finally, SentiGeek will shed light on prediction based on text analytics data.
Steve Dodd, Socialgist
Charles Huot, GFII
The GFII is the French association gathering companies from the various sectors of the professional information industry, including scientific and legal publishers, press companies, information producers, information brokers, information users, and dedicated software and service companies. Variety is one of the major aspects of Big Data: high volumes of information are available in heterogeneous formats, in an expanding number of languages, and on several types of media. In this context, decision making at any level requires the ability to capture and analyse varied information and to render the results in usable forms. The panel will present use cases illustrating decision making from different types of information sources.
Ariane Nabeth, Vecsys
Jocelyn Bernard, ReportLinker
We developed a search algorithm evaluating the probability of an individual being a member of a specific group. This algorithm was developed by observing a group of kangaroos. We are serious, aren't we? We use the algorithm to understand in which industry a company operates, who its competitors are, and who the new entrants are. We will explain how.
François-Régis Chaumartin, Proxem
Michalis Michael, DigitalMR
This presentation will briefly describe social listening 1.0 and 2.0, and then focus on third-generation platforms, very few of which are being launched this year. Gen 1: sentiment accuracy below 60%, usually only one language supported, and no capability to drill down into topics, sub-topics and attributes; when these tools provide sentiment, it is usually at the brand or search-term level, and 80%-90% of what a user query returns is noise. Gen 2: a platform specifically developed for consumer insights, one that addresses all the shortcomings of first-generation tools mentioned above. Gen 3: platforms that link sentiment to customer behaviour and customer profiles, and are purposely focused more on analysing images, voice and video for sentiment and more granular emotions. They use text analytics to deal with sources of unstructured text beyond social listening, e.g. email databases, instant messaging and call-centre conversations, and integrate them to synthesize unique insights. Delegates will learn about: a unique way of doing emotion analytics on social media; the commercial use cases for emotion analytics; how to integrate emotion tracking with other data sources; and the latest on image analytics and how text analytics can be enhanced.
Philippe Wacker, LT-Innovate
Susanne Weber, BBC
Peggy van der Kreeft, Deutsche Welle - Matt Simpson, Ericsson - Alexandra Birch, University of Edinburgh - David Imseng, Recapp - Pia Virtanen, YLE
Joachim Koehler, Fraunhofer
This presentation shows the production Fraunhofer Audio Mining solution used by the largest German broadcaster, WDR, to index and archive huge amounts of broadcast data. It covers the underlying technology based on deep neural networks, training procedures using very large speech corpora, and the final solution integrated into a production broadcast environment. The evaluation and the results of its use by media experts are also reported.
Jean-Francois Damais, Ipsos
Lana Novikova, Heartbeat & Odile Jagsch, Kantar TNS
Caroline Brun, Xerox Research Centre Europe
The World Wide Web has become a global forum where people share their feelings about almost everything. Social media networks spread vast amounts of user-generated content, where millions of people's opinions are openly accessible. This content is of great value to policy makers, social scientists, and businesses. Due to the sheer quantity of information and the diversity of comments, managing brand reputation and customer relations will increasingly rely on technology that can automatically and reliably detect not simply binary opinions, but also more subtle, nuanced sentiments and mixed feelings. While most of the work in sentiment analysis has primarily focused on classifying the overall polarity of user-generated documents, opinions or sentiments are usually not one-dimensional but multi-dimensional. People often care differently about different characteristics and features of products and services. There is a real need for organizations to understand what these specific sentiments are so they can pinpoint and prioritize what to do, e.g. changing features, improving service or communicating differently. Aspect Based Sentiment Analysis (ABSA) is precisely about mining text and summarizing the opinions expressed to ascertain the attitude of a speaker or writer toward specific entities (things) and their aspects. The Xerox Research Centre Europe is exploring this challenging topic using a combination of advanced natural language analysis with machine learning algorithms. We will briefly describe the topic and its related challenges, and share our work on multilingual ABSA, including our participation in the SemEval 2016 international challenge on ABSA. We are currently looking to pilot the system with partners.
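By way of illustration only, the sketch below shows the simplest possible lexicon-and-parse approach to aspect-level sentiment; it is a hypothetical toy, not the Xerox system, and the opinion lexicon and example review are invented.

import spacy

nlp = spacy.load("en_core_web_sm")
OPINION = {"great": "positive", "friendly": "positive", "slow": "negative"}

def aspect_sentiments(review):
    doc = nlp(review)
    results = []
    for chunk in doc.noun_chunks:                    # candidate aspect terms
        for tok in chunk.root.children:              # adjectival modifiers
            if tok.dep_ == "amod" and tok.lower_ in OPINION:
                results.append((chunk.root.text, OPINION[tok.lower_]))
    return results

print(aspect_sentiments("The hotel has friendly staff, great food and slow service."))
# Expected (parser permitting): staff/positive, food/positive, service/negative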
The domain of healthcare can be viewed as a big text-generating enterprise: clinical notes, findings, reports, protocols and letters are being created incessantly. More often than not this is largely based on structured data, yet a human still has to sit in front of a microphone and dictate. The resulting text is often "mined" for information that was already present in the structured data. The talk focuses on alternatives to this odd state of affairs. Employing Natural Language Generation to create medical documents from structured data seems not too complicated, but it meets many difficulties that, for instance, stock exchange reports or weather forecasts do not face. Nevertheless, the problem should not be unsolvable. A short demonstration highlights that such NLG systems can be built if the task is well understood.
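To ground the claim that the generation step itself seems not too complicated, here is a deliberately naive template-based sketch; the record fields and wording are invented, and a real medical NLG system would need far more care with terminology, negation and context.

# Hypothetical structured finding; field names are invented for illustration.
finding = {
    "exam": "chest X-ray",
    "date": "21 November 2016",
    "observation": "no acute abnormality",
}

TEMPLATE = "A {exam} performed on {date} showed {observation}."

def generate(record):
    return TEMPLATE.format(**record)

print(generate(finding))
# -> A chest X-ray performed on 21 November 2016 showed no acute abnormality.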
Christophe Bourguignat, Zelros
Chatbots and conversational interfaces are the new big thing: they are intuitive, natural, and can be deployed quickly without downloading a new app. Use cases for enterprises are numerous: agenda management, customer support, ... During this talk, we will focus on one particular B2B application: how intelligent assistants can reinvent the way analysts and data scientists access and spread advanced data analytics across organizations. We will describe a few scenarios and technological challenges, such as NLP and NLU, in this type of context.
Jose Quesada, Lekta AI
Several market studies estimate impressive growth across different areas and applications of language technology focused on improving voice-based human-machine interfaces. Even if it is difficult to draw clear technological or functional borders between Spoken Dialogue Systems, Intelligent Virtual Assistants, ChatBots, or Conversational Interfaces, all of these approaches share the demand for high-quality Speech Recognition, Natural Language Understanding, Dialogue Management, Language Generation and Speech Synthesis. However, and despite decades of research and development, building such systems is still a major challenge for academia and industry. The statistical approach (based on different machine learning techniques) is progressively replacing techniques based on finite states, frame-based schemes or, in general, the hand-crafted model that has dominated dialogue systems for industrial applications. Partially Observable Markov Decision Processes (POMDPs), Reinforcement Learning and, more recently, Deep Learning can be named among the key techniques that have proved a breakthrough in this field. In this talk I will present the key motivations, design constraints and functional goals applied in the Fluency project. Fluency is a framework currently under development by the Lekta company. Our main goal is to build a hybrid architecture integrating the most advanced research achievements while ensuring critical business requirements such as dialogue control and optimization, business intelligence integration, reliability and scalability.
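For readers unfamiliar with the reinforcement-learning angle, the toy sketch below shows tabular Q-learning driving a two-slot dialogue policy against a crude user simulator; the states, actions and rewards are all hypothetical and far simpler than anything in the Fluency framework itself.

import random
from collections import defaultdict

ACTIONS = ["ask_slot", "confirm", "end_dialogue"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
Q = defaultdict(float)                      # (state, action) -> estimated value

def choose(state):
    if random.random() < EPSILON:           # explore occasionally
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def step(state, action):
    """Crude user simulator: state = number of filled slots (0, 1 or 2)."""
    if action == "ask_slot" and state < 2:
        return state + 1, -1, False         # small cost per extra turn
    if action == "confirm" and state == 2:
        return state, 20, True              # task completed successfully
    return state, -5, action == "end_dialogue"

for _ in range(2000):                       # learn a policy by trial and error
    state, done = 0, False
    while not done:
        action = choose(state)
        nxt, reward, done = step(state, action)
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = nxt

print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in (0, 1, 2)})
# Typically learns: ask_slot, ask_slot, then confirm.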
Matthias Heyn, SDL (pending)
Xiang Yu, OptiRisk
We have created an innovative and dynamic trading strategy for equities, with a particular focus on controlling downside risk. The mathematical concept behind the approach is called stochastic dominance, where investment decisions are based on distributions rather than moments. A major contribution of news sentiment is in the prediction of future distributions. Regression analysis on news sentiment and regime switching models are employed to digest market moods and account for changing market situations.
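For readers unfamiliar with the term, a standard formal statement of second-order stochastic dominance (the variant usually associated with risk-averse investors; the notation below is ours, not OptiRisk's) is:

% Return R_A dominates R_B in the second order iff, for all outcome levels x,
\[
  \int_{-\infty}^{x} F_{R_A}(t)\,\mathrm{d}t \;\le\; \int_{-\infty}^{x} F_{R_B}(t)\,\mathrm{d}t
  \qquad \text{for all } x \in \mathbb{R},
\]
% with strict inequality for at least one x, where F_R denotes the cumulative
% distribution function of the return R. Equivalently, every risk-averse
% investor (increasing concave utility u) weakly prefers A: E[u(R_A)] >= E[u(R_B)].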
Marcus Hassler, EconoB
Social media channels provide a valuable and rich source of business information relevant to fintechs, especially in capital management. After integrating and managing the high data loads, the challenge is the real-time evaluation of that information for specific insights and impacts. Solving this difficulty opens a wide range of opportunities for fintechs, including research, trading and asset allocation. With TWIction Finance (http://twiction.lingrep.com), econob presents a platform that enables domain-dependent live Twitter monitoring of statements relevant to capital management. In this talk the focus is on measuring and combining the mood in the capital management domain with the concrete evaluation and classification of topics such as “tax”, “fraud”, “buy/sell”, “trading”, and “global economy”. A key factor in mood detection within small and often contextless textual fragments is a highly specialized linguistic assessment based on the identification of simple and complex linguistic patterns. For capital management, detecting the mood and opinion is only half the story, because in the field of Big Data noise distorts accuracy, so that a multi-dimensional analysis combining classification and concepts becomes essential. As a result, the financial expert is able to form opinions on finance-relevant issues across millions of tweets by combining concepts, mood and classification, e.g. VW is downgraded from hold to sell. This presentation provides the building blocks that enable these relevance-filtered, concept-related sentiment assignments of financial classification. Further, application areas and future directions for this kind of tool are outlined.
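As a purely illustrative sketch of pattern-based topic and mood tagging on short, contextless tweets, consider the toy below; the regex patterns, topic list and example tweet are hypothetical and vastly simpler than TWIction's linguistic rules.

import re

TOPICS = {
    "fraud": re.compile(r"\b(fraud|scandal|manipulat\w+)\b", re.I),
    "buy/sell": re.compile(r"\b(buy|sell|downgrade[ds]?|upgrade[ds]?)\b", re.I),
}
MOODS = {
    "negative": re.compile(r"\b(fraud|sell|downgrade[ds]?|plunge[sd]?)\b", re.I),
    "positive": re.compile(r"\b(buy|rally|upgrade[ds]?|beat[s]?)\b", re.I),
}

def tag(tweet):
    topics = [t for t, p in TOPICS.items() if p.search(tweet)]
    moods = [m for m, p in MOODS.items() if p.search(tweet)]
    return topics, moods

print(tag("VW downgraded from hold to sell after fraud probe"))
# -> (['fraud', 'buy/sell'], ['negative'])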