Hal Varian on the Need for Data Interpreters

Hal Varian, Google’s chief economist, gave a nice summary of a major need of our era.

“The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it.

“I think statisticians are part of it, but it’s just a part. You also want to be able to visualize the data, communicate the data, and utilize it effectively. … being able to access, understand, and communicate the insights you get from data analysis—are going to be extremely important.”

Hal Varian, Google’s Chief Economist, 2009

KazAnova on Stacking: Leveraging Multiple Machine Learning Algorithms for Better Predictive Models

Machine learning can be a powerful tool in the creation of predictive models, but it is no magic bullet. In the end, effective machine learning works much like other high-value human endeavors: it requires experimentation, evaluation, lots of work, and a measure of hard-earned wisdom.

As Kaggle Competitions Grandmaster Marios Michailidis (AKA KazAnova) explains:

No model is perfect; almost every model makes mistakes. Moreover, each model has different advantages and disadvantages, and each tends to see the data from a different angle. Leveraging the uniqueness of each model is essential for building highly predictive models.

To help with this process, David H. Wolpert introduced the concept of stacked generalization in a 1992 paper.

Michailidis explains the process as follows (a minimal code sketch of the four steps appears after the list):

Stacking or Stacked Generalization … normally involves a four-stage process. Consider three datasets A, B, and C. For A and B we know the ground truth (in other words, the target variable y). We can use stacking as follows:

  1. We train various machine learning algorithms (regressors or classifiers) on dataset A.
  2. With each of these algorithms we make predictions for datasets B and C, and we create new datasets B1 and C1 that contain only these predictions. So if we ran 10 models, then B1 and C1 have 10 columns each.
  3. We train a new machine learning algorithm (often referred to as a meta learner or super learner) using B1.
  4. We make predictions with the meta learner on C1.
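
To make those four steps concrete, here is a minimal sketch in Python using scikit-learn. The synthetic data, the three-way split, the particular base models, and the use of predicted probabilities as the stacked columns are all illustrative assumptions, not part of Michailidis’s description:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# A hypothetical three-way split standing in for datasets A, B, and C:
# A and B have known targets; C is the data we ultimately want to predict.
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_a, X_rest, y_a, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_b, X_c, y_b, y_c = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Step 1: train several base learners on dataset A.
base_models = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    GradientBoostingClassifier(random_state=0),
    LogisticRegression(max_iter=1000),
]
for model in base_models:
    model.fit(X_a, y_a)

# Step 2: predict on B and C; the predictions alone form the new
# datasets B1 and C1 (one column per base model).
B1 = np.column_stack([m.predict_proba(X_b)[:, 1] for m in base_models])
C1 = np.column_stack([m.predict_proba(X_c)[:, 1] for m in base_models])

# Step 3: train the meta learner on B1 against B's ground truth.
meta_learner = LogisticRegression()
meta_learner.fit(B1, y_b)

# Step 4: final predictions come from the meta learner applied to C1.
print("stacked accuracy on C:", accuracy_score(y_c, meta_learner.predict(C1)))
```

In practice, step 2 is often carried out with out-of-fold predictions on the training data rather than a separate holdout B, but the split version above follows the quoted description most directly.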

As part of his PhD work, Michailidis developed a software framework named StackNet to speed up this process.

Marios Michailidis describes StackNet in this way:

StackNet is a computational, scalable, and analytical framework, implemented in Java, that resembles a feedforward neural network and uses Wolpert’s stacked generalization across multiple levels to improve accuracy in classification problems. Unlike a feedforward neural network, which is trained through backpropagation, the network is built iteratively one layer at a time (using stacked generalization), with each layer using the final target as its target.
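
As an illustration of that layered idea, here is a Python sketch of the general technique (not StackNet itself, which is a Java framework). Each layer is fit against the same final target, and its out-of-fold predictions become the features for the next layer; the data and model choices are placeholder assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for a real classification problem.
X, y = make_classification(n_samples=2000, n_features=20, random_state=1)

# Each inner list is one "layer"; every layer is trained against the same
# final target y, as in the StackNet description above.
layers = [
    [RandomForestClassifier(n_estimators=50, random_state=1),
     LogisticRegression(max_iter=1000)],
    [LogisticRegression(max_iter=1000)],  # final layer: a single meta model
]

features = X
for layer in layers:
    # Out-of-fold predicted probabilities keep each layer from seeing
    # its own training labels reflected in the next layer's features.
    preds = [
        cross_val_predict(model, features, y, cv=5, method="predict_proba")[:, 1]
        for model in layer
    ]
    features = np.column_stack(preds)

print("final layer output shape:", features.shape)  # (2000, 1)
```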

StackNet is available on GitHub under the MIT license.

Be sure to read the full interview with Michailidis about stacking and StackNet on the Kaggle blog.

Strategy Tips for Kaggle Competitors

Martin O’Leary recently posted some sound advice for Kaggle competitors. You can find the original three-paragraph version on the Kaggle wiki.

Here I’ll break it into four key points; a quick model-comparison sketch for point 2 follows the list:

  1. Spend a while on visualization, making graphs of various properties of the data and trying to get a feel for how everything fits together.
  2. Test the performance of a variety of standard algorithms (random forests, SVMs, elastic net, etc.) to see how they compare. It’s often very informative to look at which data points are the least well predicted by standard algorithms, as this can give you a good idea of what direction to move in. (Be warned: Home-brew algorithms can be useful later on in a project, but in the early stages you want to try out as many things as possible, not get bogged down in the details of implementing a particular algorithm.)
  3. Then move into the nitty-gritty details once you have a sense for the lay of the land.
  4. Of course, all this assumes a certain kind of problem, where the data is already in numeric/categorical form. For more “interesting” datasets, such as the recent Automated Essay Scoring competition, a lot of the early work is in feature extraction — just looking for numbers which you can pull out of the data. That tends to be a bit more creative, and I use a variety of tools to see what works best. However, one of the joys of this kind of problem is that every one is different, so it’s hard to give general advice.
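
As a rough illustration of point 2, the sketch below compares a few standard scikit-learn models with cross-validation; the synthetic dataset, the specific models, and the scoring metric are all placeholder assumptions:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Placeholder regression data standing in for a competition training set.
X, y = make_regression(n_samples=1000, n_features=15, noise=10.0, random_state=0)

candidates = {
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "SVM (RBF)": SVR(),
    "elastic net": ElasticNet(),
}

# Cross-validated R^2 gives a quick, tuning-free baseline comparison.
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Inspecting the residuals of whichever baseline wins is then a cheap way to see which data points are least well predicted, per O’Leary’s suggestion.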

Sam Altman and Y Combinator

Tad Friend at The New Yorker has just published a great snapshot of Sam Altman’s leadership at Y Combinator. It is worth a read for anyone interested in topics like:

  • technology and human progress
  • generating the next phase of economic growth
  • A.I. and transhumanism
  • crowdfunding a smart city
  • etc.

The following excerpts give a hint at the full contents of the article.

On Altman’s ruthless enthusiasm for big, future-transforming ideas:

“Altman is rapidly building out an economy within Silicon Valley that seems intended to essentially supplant Silicon Valley—a guild of hyper-capitalist entrepreneurs who will help one another fix the broken world. Everyone has cautioned him against it.”

On A.I., human limitations, and technological possibilities (embracing the Singularity?):

“On a daylong hike with friends north of San Francisco, Altman relinquished the notion that human beings are singular. As the group discussed advances in artificial intelligence, Altman recognized, he told me, that “there’s absolutely no reason to believe that in about thirteen years we won’t have hardware capable of replicating my brain. Yes, certain things still feel particularly human—creativity, flashes of inspiration from nowhere, the ability to feel happy and sad at the same time—but computers will have their own desires and goal systems. When I realized that intelligence can be simulated, I let the idea of our uniqueness go, and it wasn’t as traumatic as I thought.” He stared off. “There are certain advantages to being a machine. We humans are limited by our input-output rate—we learn only two bits a second, so a ton is lost. To a machine, we must seem like slowed-down whale songs.”

On the idea of a crowdfunded smart city:

“Recently, YC began planning a pilot project to test the feasibility of building its own experimental city. It would lie somewhere in America, or perhaps abroad, and would be optimized for technological solutions: it might, for instance, permit only self-driving cars. “It could be a college town built out of YC, the university of the future,” Altman said. “A hundred thousand acres, fifty to a hundred thousand residents. We crowdfund the infrastructure and establish a new and affordable way of living around concepts like ‘No one can ever make money off of real estate.’ ”

On the preference for action over caution:

“For Altman, the best way to discover which future was in store was to make it. One of the first things he did at OpenAI was to paint a quotation from Admiral Hyman Rickover on its conference-room wall. “The great end of life is not knowledge, but action,” Rickover said. “I believe it is the duty of each of us to act as if the fate of the world depended on him. . . . We must live for the future, not for our own comfort or success.” Altman recounted all the obstacles Rickover overcame to build America’s nuclear-armed Navy. “Incredible!” he said. But, after a considering pause, he added, “At the end of his life, when he may have been somewhat senile, he did also say that it should all be sunk to the bottom of the ocean. There’s something worth thinking about in there.” ”