25 May 2009, Posted by Matthew Reinbold in PowWow, Tools
Wolfram Alpha, Search Versus Extrapolation: Vox Pop Pow Wow 2009-05-18
Working in small, distributed teams can be an exercise in loneliness. The solitary existence can mean missing out on a shared revelation or not getting feedback on a forming assumption. The Vox Pop Pow Wows are a chance for a group of peers to get together over Skype, talk about the news of the day, and provide that professional support that we otherwise might go without. Here’s the transcript from a recent talk [edited for readability].
A. – Matthew Reinbold, Founder and Creative Principal, Vox Pop Design
B. – Matthew Orstad, Founder and Chief Engineer, Rocket Midwest
- Toward More Transparent Business
- Wolfram Alpha, Search Versus Extrapolation
- Small and Special Conference, Philosophy of Small Business
- Akoha and Alternative Currencies
- Selenium Web Driver, Functional Automated Testing
- Natural Docs, Auto-documentation from Source Code
- Accounting for Freelancers, QuickBooks, Outright
Wolfram Alpha, Search Versus Extrapolation
A. So, straight off the top, did you have anything you wanted to cover before we get into things?
B. Have you heard of Wolfram Alpha? I hadn’t heard anything about it until after the fact, so apparently I’m in the wrong nerdy circles. But did you play around with Alpha at all?
A. I did not. Did you?
B. Not really, because I couldn’t get on it.
A. Everything that I’ve read is second hand – I haven’t had any first hand experience. It’s an intriguing idea but, apparently, Skynet has not yet reached sentience. It does some things really, really well but as soon as you go outside those programmed rails, you pop the track and things become disastrous. One of the biggest concerns that I’ve read is just how they’re processing information, whereas Google is pretty much real time and up to date. I think it was Mashable that had the story that if you search for the population of Philadelphia on Google you get the current number. If you search for the population of Philadelphia on Wolfram you get some very nice charts but it’s 2006 data. That’s supposed to be one of the biggest challenges with how they’re processing things: keeping data up to date. Everything is data and it goes into these massive databases that are on super-computers. The other thing is you have these algorithms that process massive amounts of information, requiring super-computers. Meanwhile Google, if I remember correctly, is using all off-the-shelf, cheap hardware in server farms: redundant, concurrent, parallel, affordable stacks of Linux. Will Wolfram be able to scale as well?
B. I performed a couple of searches and it looks like they’re using some sort of Java thing. That isn’t terribly surprising. It said “too many concurrent connections” and I just kept thinking it was my connection or something. It’s probably got a severe bottleneck issue.
I think it’s interesting. I had just heard about it, like, two days ago when they did the beta launch. I hadn’t heard about it before but I watched a video. I can see the potential. Of course, to everybody it’s a search engine sort of thing just because it’s got a text box you put terms in: “Oh, it must be a Google search engine clone.” But it really isn’t, not from everything that I’ve watched. It’s not a general search engine like Google or Yahoo or MSN. It’s more like “we have a big huge data set – let’s do something mathematical”, you know, “make it computable”.
A. Uh-huh
B. Like “let’s take Wikipedia and be able to infer all of these things mathematically”, “define all these relationships algorithmically”, and crunch, like, huge, crazy numbers.
A. Right
B. It is pretty fascinating, just the sort of stuff you can ask. Say you were doing some of the medical examples, like asking “what’s the LDL level for a 46-year-old male in the UK?” Then correlation studies: “is there a higher correlation between these two things?” and that’s like WOW! I can just think of a lot of sort of low level questions that used to probably require a lot of digging and number crunching that, if they can get the kinks worked out, are all basically answered.
A. Like, do you have some examples? I mean, beyond just the medical one?
B. Those are the main ones I see now. There’s a danger of correlation implying causation but I can see a lot of social uses. Like, “let’s do crime rates” or “let’s do law enforcement”. You know, all of these sort of statistical data sets. Then ask all sorts of interesting questions that before you needed maybe a PhD or something to program into whatever statistical package you were using.
A. Right
B. And to dig through terabytes or gigabytes maybe even…
A. I don’t doubt the complexity of the algorithms that are running. Using a data set that large and being able to pull answers back and format them is incredible. I don’t know, we’ll see. For the Alpha launch they were going to stream it live over Mashable, or on Mashable through Ustream. There were some issues with how they were doing it; there was some last second glitch, so it’s an incredibly complex piece. I just don’t know how brittle it is. The thing about Google is that it’s robust and real time and it self-adjusts, right? Because the algorithm is all based on link backs, as people’s opinions and attitudes change in formats like blogs, it’s updating the engine. It’ll be interesting.
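[Editor’s aside: the “link backs” idea mentioned above can be sketched with the textbook PageRank power iteration. This is only an illustration, not Google’s actual ranking, which is not public. The function name `link_rank`, the page names, and the damping value of 0.85 are all assumptions made for the example.]

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by who links to them (simplified PageRank sketch).

    `links` maps each page to the list of pages it links out to.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: two blogs and a news site.
links = {
    "blog_a": ["news_site"],
    "blog_b": ["news_site", "blog_a"],
    "news_site": ["blog_a"],
}
before = link_rank(links)

# A new blog appears and links to blog_b: opinions changed, so scores change.
links["blog_c"] = ["blog_b"]
after = link_rank(links)

print(sorted(before, key=before.get, reverse=True))
print(after["blog_b"] > before["blog_b"])
```

[In this toy graph nobody links to blog_b at first, so it sits at the baseline score; once blog_c links to it, its score rises without anyone editing a central database. That is the self-adjusting behaviour the conversation is pointing at.]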
