Denver, Colorado -- November 17th, 2017
- Workshop Location: Colorado Convention Center, 406-407
- Date: Friday (11/17/2017)
- Time: 8:30AM - 12:00PM
- 8:35AM: Keynote
- Title: Cloud based systems and challenges for data rich research workloads
- Speaker: Vani Mandava (Director of Data Science, Microsoft Research)
- Bio: Vani Mandava is a Director of Data
Science at Microsoft Research in Redmond with over a decade of experience
designing and shipping software that is in use by millions of users across the
world. She is passionate about enabling academic researchers and institutions
to develop technologies that fuel data-intensive scientific research using
advanced techniques in data management and data mining, especially by leveraging
the Microsoft cloud and AI platform.
She has enabled the adoption of
data mining best practices in various v1 products across Microsoft
client, server and services in the Office, SharePoint and Online Services (Bing
Ads) organizations, co-authored the book ‘Developing Solutions with InfoPath’,
and holds patents in online services architecture. She co-chaired KDD Cup 2013
and partners with many academic and government agencies, including the NSF-funded
Big Data Innovation Hub effort, a consortium coordinated by top US data
scientists and expected to advance data-driven innovation nationally in the US.
- Abstract: In this talk we show a glimpse of
typical research topologies across hundreds of research projects over the past
few years that leveraged platform services and data systems in the cloud. We
then look at a few case studies that dive into specific techniques that
researchers have used to distribute and efficiently process different types of
data using diverse data architecture and storage mechanisms, in domains such as
genomics, smart cities, healthcare, and education among others. Finally, we
explore the hard challenges in this space, and some recent and upcoming
- 9:35AM: Paper Presentation
- 9:35AM - 10:05AM: Brandon Posey, Christopher Gropp, Alexander Herzog and Amy Apon. Automated Cluster Provisioning And Workflow Management for Parallel Scientific Applications in the Cloud.
- 10:25AM: Paper Presentations
- 10:25AM - 10:55AM: Gourav Rattihalli, Pankaj Saha, Madhusudhan Govindaraju and Devesh Tiwari. Two stage cluster for resource optimization with Apache Mesos.
- 10:55AM - 11:25AM: Pankaj Saha, Angel Beltre and Madhusudhan Govindaraju. Scylla: A Mesos Framework for Container Based MPI Jobs.
- 11:25AM - 11:55AM: Anthony Kougkas, Hariharan Devarajan and Xian-He Sun. Syndesis: Mapping Objects to Files for a Unified Data Access System.
- 11:55AM - 12:00PM: Concluding Remarks