Colloquium: Stephan Grell, topic: Bing – Looking at User Data and Processing Tools for Bing Relevance Engineering
Wednesday, 25th of January 2017, 2:15 pm, Interims HS 2 (Campus Garching)
Since its launch in 2009, Bing has become the second largest search engine in the world and serves users worldwide in 40 languages. Within this time its market share in the US has grown to more than 20%, and with it the amount of user data we receive. The first part of this talk will look at how we use the Bing user data to train the Bing AI system to work across many countries and how we use the data to understand the users' search needs. The user data is key in defining our AI training metrics (online / offline) as well as in generating our training data. I will discuss the challenges we face with data quality and the methods we apply to make the data useful for training.
The second part will look at the tools we use to manage the data, enable its processing, and allow our scientists and developers to collaborate on the same data. This part will also highlight the challenges we face when thousands of scientists are working off the same data to build AI systems for different parts of the Bing stack, how to manage knowledge and data sharing between teams, and, most importantly, how to clean the data up.
Stephan Grell is a Senior Developer at Microsoft in Munich and has been with Microsoft for the last 10 years. He started out at MSR and joined Bing in 2009. In his time at Bing, he has worked on and led different projects around Web Discovery, Index selection, feature engineering & ranker training for Web Relevance, Relevance Tools, and shipping process engineering. Throughout his entire time with Bing he has been a VC admin of Cosmos and has set up policies on how the existing compute/storage resources can be leveraged. Before his time at Microsoft, he worked on "workload distribution/scheduling" with Sun Microsystems and at IBM on knowledge discovery & mining.