PAPIs.io '14 schedule

Monday 17th November 2014

  • BigML

    by Poul E. J. Petersen

    In this session you will see how BigML makes machine learning more
    accessible than ever thanks to its well-defined workflow, insightful
    visualizations, and fully featured REST API. Concepts demonstrated
    will include predictive analytics with decision trees, how to solve
    over-fitting with ensembles, how to evaluate a predictive model, how
    to find patterns with clustering and how to detect anomalies.

    Don't miss the opportunity to learn first-hand how to easily create
    powerful predictive applications with BigML.

    At 9:00am to 9:45am, Monday 17th November

    Coverage video

  • Dataiku

    by Florian Douetteau

    At 9:45am to 10:30am, Monday 17th November

  • Indico

    by Slater Victoroff

    Indico is a Boston-based company working to democratize machine learning. We are building intelligent tools for smart data science, and hope to blur the lines between developer and data scientist: making data science and machine learning truly ubiquitous.

    Need text and image analytics? Come learn how to use Indico’s suite of machine learning tools — from sentiment detection to image similarity to political analysis. We’ve got wrappers for most major languages (Python, R, Node, Ruby, Java, Objective-C, and PHP). The tutorial will include an explanation of how to build a robust image-search functionality.

    At 10:30am to 11:15am, Monday 17th November

    Coverage slide deck

  • Intuitics: Building predictive web apps on top of R with drag and drop

    by Christian Mladenov

    This session will demonstrate how, starting with a linear R script, a data scientist can create an interactive web application. The script is first split into stateless functions that do things like obtaining data, cleaning it, filtering it, transforming it, running descriptive and predictive algorithms, and generating plots. A graphical interface is then created by dragging and dropping widgets (components like text boxes, drop-downs, tables, images). The widgets are configured to link the underlying R functions into a workflow. The resulting interactive application is immediately put into production, without any IT involvement. Computing resources are dynamically allocated for each user who runs the application. The use of the application is then demonstrated, including sharing the state with other users.
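
    The idea of splitting a linear script into stateless functions can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python rather than R, the function names and the tiny in-memory data set are hypothetical, and it does not reflect how Intuitics itself links functions to widgets.

```python
from statistics import mean


def obtain_data():
    # Hypothetical stand-in for a data source; returns a fresh list each call.
    return [
        {"city": "Paris", "price": 310.0},
        {"city": "Paris", "price": None},
        {"city": "Lyon", "price": 180.0},
        {"city": "Paris", "price": 290.0},
    ]


def clean(rows):
    # Drop rows with a missing price; the input list is not modified.
    return [row for row in rows if row["price"] is not None]


def filter_by_city(rows, city):
    # Keep only the rows for the requested city.
    return [row for row in rows if row["city"] == city]


def summarize(rows):
    # Descriptive step: row count and mean price (None when there are no rows).
    prices = [row["price"] for row in rows]
    return {"n": len(prices), "mean_price": mean(prices) if prices else None}


def run_workflow(city):
    # Each step depends only on its arguments, so a UI widget (for example a
    # drop-down supplying `city`) could trigger the chain at any point.
    return summarize(filter_by_city(clean(obtain_data()), city))


if __name__ == "__main__":
    print(run_workflow("Paris"))
```

    Because no function keeps state between calls, the same chain can be re-run for each user and each widget value independently.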

    At 11:45am to 12:30pm, Monday 17th November

    Coverage slide deck

  • Yhat for Rapid Deployment of Predictive Models: A Beer Recommender

    by Greg Lamp

    Building the predictive aspect of applications is the fun, sexy part. New tools like scikit-learn, pandas, and R have made building models less painful, but deploying/embedding models into production applications is challenging. We'll show how Yhat makes deploying predictive models written in Python or R fast and easy by building a beer recommendation system and an accompanying webapp.

    At 12:30pm to 1:15pm, Monday 17th November

  • Developing Extensions for RapidMiner ...rapidly

    by Sabrina Kirstein

    The open-source based predictive analytics solution RapidMiner offers an API that lets users extend its functionality easily and integrate it into predictive applications. As a result, many research organizations, universities, and companies have built their work on this platform and extended its applicability to new domains. This presentation highlights the top extensions found on the RapidMiner Marketplace and how developers can use the RapidMiner API to create their own extensions as well as to integrate RapidMiner into their solutions.

    At 1:15pm to 2:00pm, Monday 17th November

    Coverage slide deck

  • GraphLab

    by Danny Bickson

    At 3:00pm to 3:45pm, Monday 17th November

  • PredictionIO

    by Thomas Stone

    PredictionIO is an open source machine learning server for software developers to create predictive features. Traditionally, this included personalization, recommendation and content discovery in domains such as e-commerce and media. The latest version of PredictionIO will open our platform to many more use cases, such as churn analysis and trend detection, allowing developers to use the power of machine learning in any web or mobile app. We will also discuss DASE, the new software design pattern for building machine learning engines on top of PredictionIO's scalable infrastructure. It's time to see what an open source community can build by re-imagining software with machine learning.

    At 3:45pm to 4:30pm, Monday 17th November

    Coverage slide deck

  • APItools demo: integrating predictive APIs

    by Manfred Bortenschlager

    The data that go into and come out of predictive APIs are crucial. Integrating these APIs and staying on top of data quality can be a challenge: with the growing number of third-party web APIs, managing the various integrations usually takes a lot of effort. This talk will demonstrate a proxy-based approach in which data can be transformed, harmonized, or customized between the API provider and the app. For this I will use APItools, a free and open-source set of tools designed specifically to ease the integration pain for developers.

    At 5:00pm to 5:15pm, Monday 17th November

    Coverage slide deck

  • Predicting at the command line

    by Jeroen Janssens

    Jeroen Janssens is the author of Data Science at the Command Line. In this exclusive tutorial, he will show us how to use predictive APIs to make predictions from the command line.

    At 5:15pm to 6:00pm, Monday 17th November

    Coverage slide deck

Tuesday 18th November 2014

  • Keynote — Building Intelligent APIs

    by Andy Thurai

    The birth of a sophisticated Internet of Things has catapulted hybrid data collection, which mixes structured and unstructured data, to new heights. The goal of any analytics software is to find and improve data sets rather than to spend time identifying, prepping, and cleaning the data. Predicting a future issue and prescribing an action in anticipation of it is desirable, but it is not enough: if the action is ignored, a forward-thinking system should automatically adapt and suggest an advanced course correction based on the earlier action items that were not acted upon. Predictive analytics algorithms should recalibrate themselves. As the incoming data evolves, so do the algorithms: they must re-fit, re-predict and re-prescribe.

    Listen to Andy Thurai, Program Director at IBM (API, IoT and Connected Cloud), talk about how the time has come for machines and humans to work together to make each other smarter. The combination of APIs, IoTs, big data, smarter analytics, and cognitive computing is transforming the way we see the future — and more importantly, what we do about it.

    At 9:00am to 9:45am, Tuesday 18th November

    Coverage slide deck

  • Showcase — A predictive API to enhance waste recycling

    by Alexandre Vallette

    Recycling centers were designed 40 years ago and now have trouble managing the huge demand they are facing. Cars often queue for hours, and people often find that the container they need is already full. The result is a huge amount of waste that is buried or burned when it could have been recycled.

    A contextual predictive model was developed to give citizens the information they need: what is the best moment to go to which recycling center in terms of waiting time and bin availability?

    This predictive model depends on sensors deployed in each recycling center and on various open data sources.

    The API here is the web that links every part:
    - the sensors, which receive new versions of the software and send measurements in real time
    - the predictive models, which are fed with fresh measurements and fresh data
    - the web/mobile app, which displays the predictions
    - the users' requests, which are crowdsourced to a server
    - the BI tools of the waste management authorities

    We hope to demonstrate how a predictive API like ours can solve real life problems.

    At 9:45am to 10:10am, Tuesday 18th November

  • Showcase — Predicting and balancing load in city-wide bikeshare schemes

    by Raphaël Cherrier

    Bikeshare schemes are present in more than 700 cities and they're expanding rapidly (more than 200 cities are currently building such schemes). They allow people to move freely around town by bike. Using 4 years of data consisting of snapshots of the Bordeaux bike network taken every minute and of detailed weather data, we are able to predict load up to 12 hours in advance. Predictions are made available to end users through an API which is used by the popular Bordeaux bikes mobile app. Predictions are constantly updated and they can help users plan their trips, but they can also help operators anticipate bike shortages at each station (or, conversely, a surplus of bikes) and thus optimize load balancing strategies.

    At 10:10am to 10:35am, Tuesday 18th November

    Coverage slide deck

  • Showcase — Predictive applications built on Microsoft Office 365 service health API

    by Frank S. Zhang

    Office 365 Health API enables developers to build predictive mobile and web applications to monitor application and service health. Developers can use the data streams from the API to provide custom information and quick actions for our customers, partners, and internal IT stakeholders. The API is powered by the Office 365 Health Engine that performs data curation and analytics in real time using statistical and machine learning models on top of multiple signals.

    At 10:35am to 11:00am, Tuesday 18th November

    Coverage slide deck

  • Showcase — Analyzing banking data to provide relevant offers to consumers

    by Marc Torrens

    Banks are increasingly facing competition from the giant Internet companies in the financial sector. To meet this challenge they must take advantage of their unique position: they have been collecting financial behavioral data for decades. At Strands we have implemented a platform that channels relevant commercial offers from merchants to consumers within financial institutions. The relevancy of the offers is optimized by predicting the likelihood of consumers to buy products within a given industry or merchant. In this talk, we'll share some of our methods and we’ll show how merchants can create targeted audiences for their campaigns without having to deal with the complexity of predictive modeling.

    At 11:00am to 11:25am, Tuesday 18th November

  • From R&D to production-ready predictive apps: architecture challenges in large organizations

    by Yann BARRAUD and Christophe Bourguignat

    In this session, we will discuss the different technical options that have been considered in our lab to build predictive apps within a large organization. They rely either on open source or commercial products. We will provide some feedback on ongoing experimentations and thoughts on deployment strategies, types of platforms and frameworks.

    At 11:50am to 12:15pm, Tuesday 18th November

    Coverage slide deck

  • Real-Time bidding optimization: a behind-the-scenes look at Datacratic's predictive API

    by Nicolas Kruchten, Eng.

    Real-time bidding, in the context of digital marketing, refers to the purchase of advertising impressions one at a time, responding to tens of thousands of messages per second, paying a different price for each via an auction mechanism. This talk will cover in detail how Datacratic’s RTB Optimizer Prediction API predicts the outcome of buying a given impression, then computes the economic value of that outcome to produce optimal bidding behaviour.
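
    The two steps named above (predict the outcome, then value it) can be illustrated with a small worked example. This is a minimal sketch under stated assumptions, not Datacratic's API: the function name, the parameters and the rule of bidding the expected value (predicted probability times the value of the outcome, optionally capped) are hypothetical choices made for illustration.

```python
def expected_value_bid(p_outcome, outcome_value, max_bid=None):
    """Return a bid equal to the expected value of one impression.

    p_outcome     -- predicted probability of the desired outcome (0..1)
    outcome_value -- economic value of that outcome, in currency units
    max_bid       -- optional cap on the bid (hypothetical safety limit)
    """
    if not 0.0 <= p_outcome <= 1.0:
        raise ValueError("p_outcome must be a probability between 0 and 1")
    if outcome_value < 0:
        raise ValueError("outcome_value must be non-negative")
    bid = p_outcome * outcome_value
    if max_bid is not None:
        bid = min(bid, max_bid)
    return bid


if __name__ == "__main__":
    # A 0.2% predicted conversion chance on an outcome worth 50 units.
    print(expected_value_bid(0.002, 50.0))
```

    With a predicted probability of 0.002 and an outcome worth 50 units, the expected value of the impression is 0.002 × 50 = 0.10 units, so each impression receives a different bid as its prediction changes.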

    At 12:15pm to 12:40pm, Tuesday 18th November

  • Overcoming challenges in understanding text automatically

    by Elena Álvarez Mellado

    Human sentiments are broad and diverse. Products, brands and people can make us feel a wide range of emotions: happiness, anger, disappointment… Users and consumers express these feelings and opinions on Twitter, Facebook, blogs and comments, within the reach of companies and organizations. Understanding language is the key to learning what the community thinks about a particular product and to making predictions about it. Mastering language, however, is tricky: the diversity of vocabulary, the differences between regions and the use of irony and metaphors make sentiment analysis a complex and fascinating task.

    In this session we will see a practical case of how sentiment analysis allows opinions to be detected, the challenges that arise in understanding language automatically, and the lessons learned when working on languages that are spoken in different countries, like English, Spanish or French.

    At 12:40pm to 1:05pm, Tuesday 18th November

  • Building a new predictive model & API in 30 minutes

    by Claudiu Barbura and David Talby

    Real, production-grade big data apps require integration between data ingestion components, Beyond Hadoop technologies, data science & modeling tools, publishing data to low-latency, high-throughput REST APIs backed by SQL or NoSQL stores, a visualization / application layer, as well as monitoring, instrumentation, security and administration tools. Adding intelligence to such an application also requires a broad set of machine learning algorithms, the ability to train & measure experiments in parallel, the ability to publish models as robust, low-latency front-end APIs, and having online measurements in place. All this results in man-months of work on an average project to glue these pieces together into a production system. In this talk we’ll describe some of the main gaps we’ve had to address building dozens of such systems over the years, the design patterns & reference architecture we’ve come to adopt, and some handy tools to automate common tasks. We’ll demonstrate the challenges by building an end-to-end scalable, intelligent app during the session.

    At 1:05pm to 1:40pm, Tuesday 18th November

  • Ubiquitous Small Learning: enabling non-specialist developers to use machine learning in unexpected ways

    by Keiran Thompson

    Big Data gets all the headlines but the larger shift will come from machine learning becoming ubiquitous. API services like Datagami enable non-specialist web developers to use machine learning in unexpected ways. We will discuss a particular example of house price modelling as a component of a larger real estate website.

    At 1:40pm to 2:00pm, Tuesday 18th November

  • Predictive APIs and Azure ML: from Data Science Economy to Deep Net Topologies

    by Misha Bilenko

    At 3:00pm to 3:45pm, Tuesday 18th November

    Coverage slide deck

  • Tools in action: Dataiku — Winning Kaggle's Yandex personalised web search challenge

    by Florian Douetteau

    At the end of 2013, Yandex organised a Learning-to-Rank competition on Kaggle. Dataiku offered to sponsor a team of 4 people (2 data scientists, one product manager, and one developer) for the contest. Our team won the first prize. This talk will provide insights on how we did it as a team:

    • Collaborative work is great especially if one part of the team can work on feature engineering and the other part on the models;
    • Cross validation is key, and really knowing the metrics you want and need to optimise is important;
    • Random forest can do the job, gradient boosting can win the job.

    At 3:45pm to 4:05pm, Tuesday 18th November

  • Tools in action: Graphlab — Getting productive with Predictive Applications

    by Danny Bickson and Shawn Scully

    One of the most exciting areas in Big Data is the development of new predictive applications; apps used to drive product recommendations, predict machine failures, forecast airfare, social match-make, identify fraud, predict disease outbreaks, and repurpose pharmaceuticals. These applications output real-time predictions and recommendations in response to user and machine input to directly derive business value and create cool experiences. These hold the true promise of Big Data.

    The most interesting apps utilize multiple types of data (tables, graphs, text, & images) in a creative way. Typically, these are developed using data that’s larger than single machine memory, but smaller than the petabytes some companies brag about housing. This “Medium Data” regime of >5 GB and <10 TB is where data science magic happens. In this talk, we’ll share the trends we’re seeing at Graphlab in predictive application development, show how to build and deploy a predictive app that exploits the power of combining different data types and representations (like graphs and tables), and, through customer case studies, share some key lessons that data scientists and developers will want to hear.

    At 4:05pm to 4:25pm, Tuesday 18th November

  • Tools in action: Datagami — Forecasting Bitcoin exchange rates

    by Keiran Thompson

    The first user of the Datagami API built an app to predict Bitcoin prices. We will demo the app and discuss the architecture of the underlying prediction engine.

    At 4:25pm to 4:45pm, Tuesday 18th November

    Coverage slide deck

  • Lightning Talks

    At 5:15pm to 5:30pm, Tuesday 18th November

  • Panel discussion — Future of Predictive APIs

    by David Gerster, Alexandre Vallette, Danny Bickson, Andy Thurai, Keiran Thompson and Misha Bilenko

    At 5:30pm to 6:15pm, Tuesday 18th November
