by John Taylor
Data Science promises to transform ubiquitous and cheap data into insights with the potential for great social, scientific and personal value. However, while many of us have the needed hacking skills and domain knowledge, we might not have a strong background in the agglomeration of formal disciplines that underpin data science methods.
I will provide a lightning tour of high level theory, concepts, and tools to extract knowledge and value from data. These are deep and wide subjects, so emphasis will be placed on the high level structures of data analysis problems that point to good solutions.
by Brian Aker
Ever wondered what would happen if you could rethink a decade's worth of design changes? Drizzle is a redesign of the MySQL server targeted at web development and optimized for Cloud applications. Update yourself on the latest features and use cases for Drizzle7, and on what is in store for the near future.
In the talk we will delve into Drizzle's multi-master support features, replication, and why no group commit is required.
We'll also explore the forecast for Drizzle: Brian provides an overview of the Drizzle project’s current state as well as what’s ahead.
In 2009 I inherited a poorly documented tsunami evacuation simulator and used it to study the impact of proposed city planning on tsunami survival in Long Beach, WA. Half of the project was learning how to use the pile of hacked-together proprietary software and scripts I was handed. The other half was baby-sitting a Windows machine on which the simulator would reliably take extended naps.
In 2011 we expanded the study into a new community. Facing new data formats, bit rot, mysterious crashes, expired software, and imminent project failure, we re-built the simulator using open-source technologies in two weeks. In addition to simply completing the project, this change has liberated us from specific operating systems and per-seat software licensing.
Outcome? Faster results, less expensive setup, and a transition from a per-simulation pricing model to one focused on finding the best possible result.
Come learn how the time we spent prying ourselves open (and free) allowed us to deliver a better analysis in less time, guarantee future users less pain, and change the dominant paradigm from proving preconceptions to exploring alternatives. I will even run an exploratory simulation live, so we can all experience the power of asking good questions.
by VM Brasseur
I say "marketing." You probably hear "advertising." Perhaps (likely) you cringe a bit. Or perhaps your mind replays snippets of "Mad Men":http://www.amctv.com/originals/madmen/, all cigarettes and scotch. It's very doubtful you think "Open Source Project" but you probably should.
Marketing is so much more than just data mining and advertising (targeted or not). Marketing is discovery and identity and community outreach. Marketing can help your project help others. Marketing can help you turn your users into contributors and your contributors into evangelists. Yeah, *now* you're thinking "Open Source Project," aren't you?
Come join me as I dispel some of the clouds of pollution which obscure the name of marketing, show how it can help your projects, and reveal how--whether you realize it or not--you already use marketing every day and how that's a very good thing indeed.
Urban Airship started using MongoDB in early 2010 and began migrating data off of it almost exactly a year later.
This talk will detail the wins Urban Airship gained with MongoDB as well as the shortcomings that eventually led to other databases being preferred. The goal is not to encourage or discourage anyone considering MongoDB but rather to share one group's experience with the database while supporting an ever-growing amount of data.
For those who saw the first iteration of this talk at Update Portland, the slides will be updated and improved, with new data on sharding, replica sets, etc.
Stephen Covey’s 1989 book The Seven Habits of Highly Effective People “presents an approach to being effective in attaining goals by aligning oneself to what he calls ‘true north’ principles of a character ethic that he presents as universal and timeless.”
Another character ethic that is universal and timeless is that of the “troll”. The practice of disrupting communication using trolling techniques is probably as old as communication itself. We’ve all done it, but some of us have gotten good at it, and regularly “attain our goals”.
"Ganeti":http://code.google.com/p/ganeti/ is a robust cluster virtualization management software tool. It’s built on top of existing virtualization technologies such as "Xen":http://www.xen.org/ and "KVM":http://www.linux-kvm.org/page/Main_Page and other Open Source software. Its integration with various technologies such as "DRBD":http://www.drbd.org/ and LVM results in a cheaper High Availability infrastructure and linear scaling.
This hands-on tutorial will cover a basic overview of Ganeti, the step-by-step install & setup of a single-node and multi-node Ganeti cluster, operating the cluster, and some best practices of Ganeti. Finally, we'll deploy and use a web-based management tool called "Ganeti Web Manager":http://code.osuosl.org/projects/ganeti-webmgr.
If attendees want to participate in the optional hands-on portions of the tutorial, there will be virtual machine "images available online":http://ftp.osuosl.org/pub/osl/ganeti-tutorial and at the tutorial itself. It’s recommended you download the image prior to the tutorial to save on setup time. We’ll be installing Ganeti on an Ubuntu VM and deploying instances using the "LXC":http://lxc.sourceforge.net hypervisor.
This tutorial will cover the following:
# Installing the base system and components
# Setting up the environment for Ganeti
# Operating a Ganeti cluster
# Deploying Ganeti Web Manager
For too long we’ve been taught that relational databases are the right tool for all jobs. While RDBMS are great for many projects, what happens when a document database is the right tool for the job? When you turn to popular open-source document databases you find your relational background working against you. Exasperated developers end up asking questions like: how do I do a join, or how do I group and query documents? What happens when I don’t have a table, a schema, rows or SQL statements?
When you’re thinking documents you normally start by thinking about the questions that you want to ask and create your model around those questions. In this interactive discussion we explore how this ends up being more intuitive than relational modeling and normalization.
Understand how people have been doing this for real-life problems or, better yet, tell us your problem and let’s discuss how we can solve it using a document database.
Bring your questions!
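As a rough illustration of modeling around the questions you want to ask, here is a minimal sketch in plain Python. It does not use any particular document database; the blog data, field names, and helper functions are all hypothetical. Comments are embedded in the post document, so the question "show me a post with its comments" needs no join, and a "group by" becomes a loop over documents.

```python
# Hypothetical blog data, modeled around the question
# "show me a post together with its comments".
posts = [
    {
        "_id": "post-1",
        "title": "Why documents?",
        "tags": ["nosql", "modeling"],
        "comments": [
            {"author": "ana", "text": "Nice overview."},
            {"author": "raj", "text": "What about joins?"},
        ],
    },
    {
        "_id": "post-2",
        "title": "Schema-free does not mean design-free",
        "tags": ["modeling"],
        "comments": [],
    },
]


def post_with_comments(post_id):
    """One lookup answers the question; no join is needed."""
    return next(p for p in posts if p["_id"] == post_id)


def comment_counts_by_tag():
    """A 'group by' written as a plain loop over the documents."""
    counts = {}
    for post in posts:
        for tag in post["tags"]:
            counts[tag] = counts.get(tag, 0) + len(post["comments"])
    return counts
```

The trade-off is that data embedded this way is shaped for one set of questions; a different question (say, all comments by one author) may call for a different document layout.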
More details soon...
by David Lazar, Michael Ernst and Werner M. Dietl
Are you tired of null pointer exceptions, unintended side effects, SQL injections, concurrency errors, mistaken equality tests, and other run-time errors that appear during testing or in the field? A pluggable type system can guarantee the absence of these errors, and many more real, important bugs.
Are you a software architect who wants to be able to quickly and easily implement custom checks that prevent more errors at compile time? You need a framework that supports you in creating a formally correct code checker.
This presentation is aimed at both audiences. A pluggable type system can give a compile-time guarantee of important properties. We will explain what it is, how to use it, and even how to create your own. You can create a simple pluggable type-checker in 2 minutes, and you can enhance it thereafter.
The demo uses the Checker Framework, which enables you to create pluggable type systems for Java. It takes advantage of features planned by Oracle for Java 8, but your code remains backward-compatible. The pluggable type-checker can be run as part of javac or via an Eclipse plug-in, and integration with build tools such as Ant and Maven is provided. The tools are "freely available":http://types.cs.washington.edu/checker-framework/.
The Checker Framework provides 12 pluggable type systems that are ready to use, including nullness, immutability, and locking checkers. The presentation will first develop a simple declarative type checker that checks the consistency of method signature strings. The presentation will then discuss the design and usage of more advanced checkers.
The Checker Framework has found hundreds of bugs in over 3 million lines of open source code, including code from Oracle, Google, Apache, and others. Overall, we found that the type checkers were easy to write, easy for novices to productively use, and effective in finding real bugs and verifying program properties, even for widely tested and used open source projects. It is easy to improve the quality of your Java code, and you can start using the Checker Framework today!
At OSCON in 2007 we announced Windmill, designed to launch browsers and simulate users. User simulation in the browser is awesome and useful — but now it’s more important to be able to run all that JS logic that we care about so much. This spawned a complete re-thinking of the problem.
Jellyfish is a Node module that knows how to initialize the environment you need and provides a communication channel to tell it what to do. However, the best way to understand how the system works is with a giant, awesome-looking graphic, which I will be happy to provide and walk through.
How do you get Jellyfish set up to run that first ‘hello world’? It’s easy; let’s all do it together.
h2. Do things better
With one line of code, you can post everything your Jellyfish scripts do and their results into CouchDB. Let’s talk about how to make that happen, and what you can do with those results.
h2. The future
I’ve had requests for environment support for Adobe Air, Ruby Racer, and Web OS. Let’s talk about what the development plan is for Jellyfish.
Jellyfish is fully Open Source software hosted on GitHub, and I would be happy to take your patches!
For more awesome, check out "Jelly.io":http://jelly.io.
So, you think you’ve got a great idea for a map, but little to no experience in map-making? Have you been making maps, but want to increase your understanding of what’s going on “under the hood”? Do you think maps are nothing but a bunch of latitude and longitude points? Then this session is for you.
We’ll introduce you to the concepts you need to know to really understand how maps work: technically, visually, and even socially. We’ll cover how your data gets from the real world to your screen, why every map is a lie, how to think about your data, and how to make your data pretty and understandable. Then we’ll talk about where to find data, and just as importantly, how to understand your data, and how to make sure the lie your map is telling is the one you mean it to tell. Finally, we’ll talk briefly about what open source tools are out there to help you manipulate and display your data.
A large percentage of recovery time during an unexpected outage is often spent determining the extent of the problem and its source. Tools that help localize the problem and quickly measure its severity are extremely helpful. The last thing you need during an outage is to have your mail server fall over, too.
And yet, why don't we have a general purpose solution to this?
This talk will explore designing error aggregation systems. We’ll cover effectively capturing events, efficiently processing them, and displaying the relevant information in real time. Error aggregators nicely complement your existing logging systems and email systems, taking the heat when there’s a problem and intelligently rolling that data up for easy analysis during a crisis.
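To make the "capture, process, roll up" idea concrete, here is a minimal in-memory sketch. It is not the design presented in the talk: the `ErrorAggregator` class, the `fingerprint` helper, and the normalization rules are assumptions chosen for illustration. Events that differ only in numbers or hex ids are grouped under one key and counted, so a flood of similar errors becomes one row instead of thousands of emails.

```python
import re
from collections import Counter


def fingerprint(message):
    """Collapse variable parts (hex ids, numbers) so similar errors group together."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    return re.sub(r"\d+", "<n>", message)


class ErrorAggregator:
    """Hypothetical in-memory aggregator: counts events per (source, fingerprint)."""

    def __init__(self):
        self.counts = Counter()
        self.samples = {}

    def capture(self, source, message):
        key = (source, fingerprint(message))
        self.counts[key] += 1
        # Keep the first raw message as a representative sample.
        self.samples.setdefault(key, message)

    def top(self, n=5):
        """Return the n most frequent error groups, most frequent first."""
        return self.counts.most_common(n)
```

A real system would also need time windows, persistence, and a display layer; this sketch only shows the grouping step.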
by Phil Tomson
Lots of attention has been given to GPUs for speeding up certain types of computations. While GPUs are very well suited for vector operations, there are other things they are not so well suited for. FPGAs (Field Programmable Gate Arrays) are not used as widely yet, but they offer a much more flexible computing fabric than GPUs. You can implement a GPU in an FPGA, for example, or you could implement your own custom processor optimized for very specialized tasks. The barrier to entry can be high for FPGAs: how does a person with a software development background get started using them? And what about HDLs (Hardware Description Languages) used to program FPGAs? What's the difference between simulation and synthesis? What kinds of tools are freely available? These are some of the questions that will be addressed in this session.
Managing innovation, intellectual property and employment law, corporate finance, building a business plan -- my master's degree in technology management gave me some grounding in a bunch of suit stuff. I'll teach you a little of each of these, plus insights from my management experience and fish-out-of-water anecdotes. Aspiring executives welcome; ties optional.
by Andrew Baerg
I have been running a programming club with my kids and their friends for a couple of years. I would like to share the strategies I have learned and show just how easy it is to get kids excited about programming. I have primarily been using Scratch as a tool and will be demoing some sample Scratch programs in the presentation.
This talk will give an overview of the OAuth 2 spec, starting with the various options the standard gives to developers for building web apps and native apps. We'll look at what the end user sees, work our way to what developers using an OAuth 2 API deal with, and we'll end up at what developers of OAuth-2-compliant APIs will need to know to successfully implement the standard.
Many large providers have recently deployed APIs using OAuth 2, including Facebook, Foursquare, Google, and more. But since OAuth 2 is technically still a "draft," many aspects of the spec change from month to month and it's sometimes hard to keep up. We'll cover the commonalities and differences between some of the major providers and draft versions. The security implications of some of the changes between versions 1 and 2 will be covered, along with recommendations for best practices. You'll also get a glimpse of the debates currently raging on the internal OAuth 2 mailing list.
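For readers who have not seen the flow, the sketch below shows the client side of an authorization code exchange using only the Python standard library. It is a hedged illustration, not any provider's API: the endpoint, client id, and redirect URI are hypothetical, and the parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`, `code`, `error`) are the commonly used ones, which individual providers and draft versions may vary.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical provider endpoint and client registration values.
AUTHORIZE_URL = "https://provider.example/oauth2/authorize"
CLIENT_ID = "my-client-id"
REDIRECT_URI = "https://app.example/callback"


def build_authorization_url(state, scope="profile"):
    """Build the URL the end user is sent to in order to grant access."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


def extract_code(callback_url, expected_state):
    """Check the redirect back to the app and return the authorization code."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; possible CSRF")
    if "error" in query:
        raise RuntimeError(query["error"][0])
    return query["code"][0]
```

In practice the `state` value would be a fresh random string per request (for example from `secrets.token_urlsafe()`), and the returned code would then be exchanged for an access token at the provider's token endpoint, a step not shown here.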
by Donald Davis
Using Arduino and other open source tools, I will demonstrate that it is possible to create a wide variety of very complicated projects in a very short period of time.
As examples I will use four art pieces and a gadget built for a sporting event. While each uses the Wiring/Arduino platform in some way, they are remarkably different in what they do. I will discuss the process involved in putting each of the pieces together as well as the tools in my bag of tricks.
by Jason LaPier
ePUB, the open e-book standard, is recognized by software and devices nearly universally (with the exception of one device that starts with a "K"). I'll describe the collection of standards that were brought together to develop the ePUB standard, and then we'll talk about the structure of an ePUB document. I'll go over some of the gotchas and reader peculiarities to watch out for (it wouldn't be a "standard" without multiple interpretations!).
I'll show you how to build an ePUB document from scratch, and then we'll look at a few tools for making ePUB generation (and validation) easier. Once we've got a simple, straightforward document down, I'll go into fun things like adding styles, images, and embedded fonts. Syntax highlighting in an e-book? Code samples that don't break across pages? And at the end, we'll put together an ePUB document with some jumbled chapter sequences, unlocking the unintentionally ultimate purpose of the standard: to create books in the style of Choose Your Own Adventure (R).
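As a taste of the document structure, here is a minimal sketch of the outer ePUB container using Python's `zipfile` module. It covers only the ZIP layer: the `mimetype` entry, which must come first and be stored uncompressed, and `META-INF/container.xml`, which points at the package (OPF) document. The `write_epub` helper and its arguments are hypothetical; the OPF, navigation, and chapter files are supplied by the caller and are not validated here.

```python
import zipfile

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="{opf_path}" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def write_epub(target, opf_path, files):
    """Write an ePUB container.

    target   -- file path or file-like object for the .epub
    opf_path -- path inside the archive of the package (OPF) document
    files    -- dict mapping archive paths to str contents (OPF, XHTML, CSS, ...)
    """
    with zipfile.ZipFile(target, "w") as archive:
        # The mimetype entry must be first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml",
                         CONTAINER_XML.format(opf_path=opf_path),
                         compress_type=zipfile.ZIP_DEFLATED)
        for name, content in files.items():
            archive.writestr(name, content, compress_type=zipfile.ZIP_DEFLATED)
```

A file produced this way still needs a valid package document and content files before a reader will open it, so running a validator on the result is a sensible next step.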
Asking for money doesn’t have to be so hard. Whether you’re working on a non-profit, or a small side project. Or you’re bootstrapping a sweet new developer event. Or you want to find angel investing or venture capital… The same skills apply.
Come learn all about how to ask for money from Selena Deckelmann and Scott Kveton. Selena co-founded Open Source Bridge, raises money for the open source project PostgreSQL and has found funding for many small non- and for-profit projects. Scott fundraised for the Oregon State University Open Source Lab, launched a bacon-focused startup and is currently CEO of Urban Airship, a VC-funded Portland, OR company.
They’ll lead you through their process, their successes and failures.
You’ll leave with proven strategies for developing relationships, asking the right questions and providing the right information to people who want to give you their money.
There’s more real open source going on at Microsoft than you’d think. There are those working within MSFT to turn the ship around in the least evil way possible. Ultimately we want folks to run Windows, sure, but the days of “our way or the highway” are coming to an end. Open source and source-opened are happening at Microsoft. The avalanche has begun and it’s too late for the pebbles to vote.
Have you always wanted to write more Python, but you feel lost in the syntax, style and semantics? Well, Python hackers like to toss many phrases around to describe the style of their code, "Pythonic" being one of the most popular. But what does this really mean?
In this talk, we will go over:
Dive deep into the suffocating postmodern ocean of data and come out alive on this interactive tour of low-hanging-fruit data mining tools. Learn to make R graphs both pretty and pretty informative; crunch numbers quickly in high level languages, like a boss; even get your Google on with some maps and reduces. It is widely known: Data scientists have all the fun, so join us!
As part of the talk, we plan to make available a data set of historical Reddit front pages, to plumb the very depths of nerd humor evolution.
by Bart Massey
A while back, a distinguished mathematics lecturer at Oxford tried a little experiment in teaching mathematics and logic to younger students. He published a series of puzzle problems in the form of a whimsical story titled _"A Tangled Tale":http://www.gutenberg.org/ebooks/29042_. He invited readers to pick a pseudonym and post their proposed solutions and comments as he went along.
The year was 1880. The forum was the magazine _The Monthly Packet_. The lecturer was the Rev. Charles Dodgson, better known by his pseudonym Lewis Carroll.
We will discuss Carroll's approach to learning, comparing it to those used on modern electronic fora such as IRC channels, discussion lists, wikis and web forums. I will propose several lessons the open tech community can learn from Carroll's example: the promise of the storytelling style, the power of pseudonymity, and the right way of handling wrong answers. I will also suggest several potential limitations to this approach.
MariaDB 5.3, 5.5 and 5.6 have, or will have, many new and improved NoSQL features that can help you solve problems that you normally can't solve easily with SQL.
The talk will cover the improvements made to the HANDLER interface, the new DYNAMIC COLUMNS interface, and how to use DYNAMIC COLUMNS to create storage engines that can access non-relational data.
by Dawn Foster
The best thing about open source projects is that you have all of your community data in the public at your fingertips. You just need to know how to gather the data about your open source community so that you can hack it all together to get something interesting that you can really use. We'll start with some general guidance for coming up with a set of metrics that makes sense for your project. The focus of the session will be on tips and techniques for collecting metrics from tools commonly used by open source projects: Bugzilla, MediaWiki, Mailman, IRC and more. It will include both general approaches and technical details about using various data collection tools, like mlstats. The final section of the presentation will talk about techniques for sharing this data with your community and highlighting contributions from key community members. For anyone who loves playing with data as much as I do, metrics can be a fun way to see what your community members are really doing in your open source project.
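As one small example of a mailing list metric, the sketch below counts posts per sender address using only the Python standard library. It is an illustration rather than the approach of any tool named above; the `posts_per_sender` helper and the archive filename in the usage note are hypothetical.

```python
from collections import Counter
from email.utils import parseaddr


def posts_per_sender(messages):
    """Count messages per sender address (case-insensitive)."""
    counts = Counter()
    for msg in messages:
        _name, address = parseaddr(msg.get("From", ""))
        if address:
            counts[address.lower()] += 1
    return counts
```

For an mbox-format archive, something like `posts_per_sender(mailbox.mbox("list-archive.mbox")).most_common(10)` would list the ten most active posters, assuming the archive is in mbox format and `mailbox` has been imported.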
21st–24th June 2011