Powerset Opens Up
I had the pleasure of attending a small gathering at Powerset’s office in San Francisco this evening to get a preview of what’s to come from the highly publicized startup. We saw a number of interesting demonstrations and heard a lot about the technology and platform they’re building.
Powerset is based around Natural Language Processing, and to that effect they’ve licensed key technology from Xerox PARC to better index the web. Where Google focuses on the statistical features of web pages, Powerset’s technology examines the actual meaning and relationships of words in each sentence. It’s important to note that this processing isn’t applied solely to the web, but also to the search queries. For examples of how this is superior you can check out Powerset’s blog and learn how Powerset can kick Google’s butt at figuring out “What Steve Jobs said about the iPod”. (I’m sure all the people I saw camped out at the Apple Store on my walk back to my hotel would be interested in that.)
The NLP technology they’re using isn’t new; in fact, it’s been under development for 30+ years now. What they’ve managed to do, and what they think will keep them ahead of the competition, is reduce the processing time for indexing one sentence from two minutes to one second. The plan is to “ride Moore’s Law” to make processing the web as a whole a possibility. Currently they’re limited to a select few sites to crawl: Wikipedia, the New York Times and ontological resources like Freebase and WordNet. Their improvements certainly sound impressive… but if you take Wikipedia’s almost 2 million pages, each with an average length of 25 sentences, that’s roughly 50 million sentences, and at one second apiece it would take a single machine a little over a year and a half to index the information. Yikes. Spreading that work across Powerset’s ~750 servers would still require nearly a day to process. Of course they’ll be using Amazon’s EC2 and building out their data centers to improve this. Even so, expanding this beyond Wikipedia and indexing billions of web pages may prove to be problematic. Their next goal is to index blogs… and I wish them luck. According to Technorati that’s 1.7 million new posts per day.
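For anyone who wants to check my napkin math, here is a quick sketch in Python. The page count, sentences per page, per-sentence times and server count are the figures quoted above; the function name and the assumption that the work splits perfectly evenly across servers (no crawl, network or coordination overhead) are mine, so treat the results as a lower bound rather than anything Powerset has published.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def indexing_seconds(pages, sentences_per_page, seconds_per_sentence, servers=1):
    """Wall-clock seconds to index a corpus.

    Assumes the sentences divide evenly across servers with no overhead,
    which is an idealization for illustration only.
    """
    total_sentences = pages * sentences_per_page
    return total_sentences * seconds_per_sentence / servers


# Figures quoted in the post.
WIKIPEDIA_PAGES = 2_000_000
SENTENCES_PER_PAGE = 25

one_machine = indexing_seconds(WIKIPEDIA_PAGES, SENTENCES_PER_PAGE, 1)
cluster = indexing_seconds(WIKIPEDIA_PAGES, SENTENCES_PER_PAGE, 1, servers=750)
old_speed = indexing_seconds(WIKIPEDIA_PAGES, SENTENCES_PER_PAGE, 120)

print(f"One machine at 1 s/sentence: {one_machine / SECONDS_PER_DAY:.0f} days")
print(f"750 servers at 1 s/sentence: {cluster / SECONDS_PER_HOUR:.1f} hours")
print(f"One machine at 2 min/sentence: {old_speed / SECONDS_PER_DAY / 365:.0f} years")
```

That works out to about 579 days on one machine and about 18.5 hours across 750 servers. At the old two-minutes-per-sentence speed a single machine would need roughly 190 years, which puts the speedup in perspective.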
The real challenge will be getting users transitioned (back?) to using natural language search. Keywords do a pretty darn good job as shortcuts, pointing us in the right direction. Most keyword-based searches handle the most common question about an entity by default… and Google sends us on our way… fast. Being blazing fast is definitely key given how quickly people can become dissatisfied when searches take longer than a second. From the demo earlier today, it looks like this delay will definitely affect Powerset. Most of their example queries took a second or longer to complete. Performing the semantic analysis of the search query takes time and a lot of computational power, which leaves Powerset at a disadvantage. Many employees were quick to point out that their technology falls back to keyword-based search if their semantic analysis fails, but the delay involved could prove to be their undoing. They’re also not Google (or any other keyword-focused search engine), so for many short queries they may not be competitive.
A timely observation from Dare’s notes on Google’s Scalability Conference:
NOVICE QUERY: Why doesn’t anyone carry an umbrella in Seattle?
EXPERT QUERY: weather seattle washington
NOVICE QUERY: can I hike in the seattle area?
EXPERT QUERY: hike seattle area
On average, it takes a new Google user 1 month to go from typing novice queries to being a search expert. This means that there is little payoff in optimizing the site to help novices since they become search experts in such a short time frame.
I’d contend that most users have already made that jump and that a lot of younger users are conditioned to search via keyword. I’m convinced that with the evolution of text messaging people have begun to natively communicate with keywords. This may be a temporary state, however. Verbal communication still retains a lot of the richness that Powerset will be poised to take advantage of, especially in the mobile space. They’re definitely looking at this and have key team members with experience in the area. It remains to be seen if Powerset can pull together a strong enough product within the next 9 months, which is the amount of time they have until they “have” to go live.
So I’ve spewed forth a lot of doom and gloom about a company that I’m really excited about. What gives? Steve Newcomb, COO and Founder of Powerset, has said that he wants his company to be the most open start-up out there. They’re going to publish their predictive modeling for data center growth to match their user growth. Ruby developer Kevin Clark said they’re looking to release some of their internal tools around packaging, monitoring and more. Newcomb also talked about how they were sharing some of their innovative practices with other startups. There’s a large sense of giving back to the community… and they want the community to help decide where their company goes.
Speaking of community, Powerset will open up a section of their site called Powerlabs in September. Their goal is to create a social networking site focused around their technology. They’re off to a good start with over 10,000 people registered already. According to current thinking, users will be able to suggest and vote on features, make profiles and gain reputation. Like most of Powerset this is all subject to change. Even the user interface for Powerlabs is radically different from the screenshot released last week. Ultimately users will be able to submit ideas on how to shape Powerset’s core product, even its name, something that Newcomb admitted isn’t set in stone.
To me it sounds like we have a company with a great underlying technology that doesn’t quite know how to apply it to the end user or if the end user will even like it. In the months approaching the September launch I believe Powerset will attempt to address some of the technology issues and continue to search internally for their killer-app that is good enough with the existing limitations. Barring that they’ll have a ton of buzz, a kickass staff, a handful of good ideas and a technology with a lot of potential. Sounds like a great target for acquisition, right? Can you think of any companies out there that already know how to scale? In the meantime the genuinely altruistic management will help other startups with the tools and processes they’ll need to ultimately be successful… or they’re just making it easier to start their next company. Everyone wins.