Blog

“LinuxWorld SF: OSDL Announces Patent Commons Project”
IDG News Service (08/10/05); Nystedt, Dan
The Open Source Development Labs (OSDL) is concerned that software patents are having a detrimental effect on open-source collaboration, and mitigating that threat is the goal behind the Patent Commons initiative the organization announced on Aug. 9. The effort will involve the collection of software licenses and patents pledged to the open-source community within a single repository for developers. The Patent Commons will also serve to lower the threat of patent-related lawsuits and ease the administrative burden of approving individual licenses, thus encouraging more companies and individuals to contribute their intellectual property to the open-source community. Vendors who make such pledges are basically promising not to pursue litigation against developers or users. The Patent Commons also assures patent holders that an organization committed to open-source software is looking after their patent enforcement rights. The project will initially concentrate on the development of a library and database to store software patents and patent licenses, in addition to patents pledged by companies. The OSDL said other legal items, such as indemnification programs offered by open-source software vendors, will also be aggregated.

LinuxWorld SF: OSDL announces Patent Commons project – Computerworld
News Story by Dan Nystedt
AUGUST 10, 2005
IDG NEWS SERVICE

The Open Source Development Labs, a group dedicated to promoting Linux, announced a new initiative called Patent Commons, to collect the software licenses and patents pledged to the open source community into a central repository to make them easier to access by developers, and encourage more patent holders to pledge their intellectual property to the cause.

The move will increase the utility of the growing number of patent pledges and promises in the past year by providing a central location for open-source software developers, OSDL said yesterday. It will also reduce the threat of patent-related lawsuits, the group said.

The move will also encourage more companies and individuals to pledge their intellectual property to the open-source community by reducing the administrative headaches posed by granting individual licenses, which the OSDL said is a barrier to the formal licensing of patents.

The OSDL hopes the measure will help encourage more companies to pledge their IP to the open source community, a group that includes IBM Corp., Nokia Corp., Novell Inc., Red Hat Inc. and Sun Microsystems Inc.
Vendors like these that pledge their patents to the Patent Commons project are, in general, promising not to file lawsuits against developers or users. At the same time, patent holders will be assured that the right to enforce the patents is watched over by an organization dedicated to open-source software, OSDL said.

The group said software patents are a huge potential threat to the ability of people to work together on open source.

The project will initially work on a library and database to house software patent licenses and patents, as well as patent pledges made by companies. It will also collect other legal items, such as indemnification programs offered by vendors of open-source software, the group said.

The Patent Commons project is still in the planning stages, the OSDL said, adding it expects to announce more details in coming months.

October 18, 2005/by ams

Intel’s “lablet” collaboration strategy

Innovation

Although Intel funds the labs, it doesn’t own the intellectual property, and the research is widely shared and published, Teixeria says. Intel won’t disclose how much it’s spending on its university research projects, but its overall R&D budget is expected to exceed $5 billion this year.

The real goal, Teixeria says, is to see if the labs can unearth something that Intel might then be able to take in-house and develop further.

“It’s this notion of both helping to grow the technology and seeing where there is a usage for it within Intel,” he says.

Does “accelerate” include pointing to technologies that have already become startups, like Xen and variants of sensors? :-)

The reference to parallel search of vast, unindexed data is intriguing — it reminds me of the scale of challenges the wayback machine is facing for the Internet Archive — how could P2P help a 40TB+ search problem, given that one is willing to trade off longer response times against much lower (centralized) costs/better-shared costs?

ACM News Service

“Intel Goes to School”
Computerworld (03/28/05) P. 40; Vijayan, Jaikumar

Intel Research is funding a quartet of university “lablets” to identify and investigate technologies that merit “acceleration and amplification,” according to company representative Kevin Teixeria. He says Intel has no claim on the intellectual property produced by the labs, because it is interested in “helping to grow the technology and seeing where there is a usage for it within Intel.” Intel’s UC Berkeley lablet is focusing on systems that employ wirelessly networked sensors to collect a wealth of information about the environment, and the TinyOS operating system and TinyDB query-processing technologies have been notable breakthroughs. Researchers are currently devising the Tiny Application Sensor Kit, a suite of tools that lab director Joseph Hellerstein says will simplify the deployment of applications that use sensor networks. A second Intel lablet at the University of Washington is combining radio frequency identification (RFID) technologies and data mining software into the System for Human Activity Recognition and Prediction, which is supposed to predict human behavior by monitoring the objects people touch and how they are used. A key tool of this research is the RFID-enabled iGlove that extracts data from objects with affixed RFID tags. Another lablet based at England’s University of Cambridge under the supervision of Derek McAuley is looking into highly distributed applications, examples of which include Xen, a “virtual-channel processing” technology that allows a single system to support multiple operating systems and users more efficiently than software-based virtualization. The Carnegie Mellon University lablet’s area of concentration is software for widely distributed storage systems, with emphasis on interactive searching of massive archives of non-indexed data, and the acceleration and enhanced accuracy of searches via embedded processors.

The Carnegie Mellon Intel lablet is investigating software for widely distributed storage systems. Researchers working with Seagate Technology LLC are trying to enable interactive searching of terabyte-size collections of nonindexed data.

As part of that effort, researchers are studying how to speed up searches and make them more accurate by embedding processors either close to or on storage devices so they can examine and discard irrelevant data close to the source.
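
The early-discard idea described above can be illustrated with a minimal sketch. It is not the lablet's software: the `StorageNode` class, the `distributed_search` function and the sample records are all hypothetical, and the "embedded processor" is simulated by running the filter inside each node object so that only matching records are returned to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class StorageNode:
    """A storage device with a (hypothetical) embedded processor."""
    name: str
    records: List[str]

    def search(self, predicate: Callable[[str], bool]) -> List[str]:
        # The filter runs "on the device": records that fail the
        # predicate are discarded here and never leave the node.
        return [r for r in self.records if predicate(r)]


def distributed_search(nodes: Iterable[StorageNode],
                       predicate: Callable[[str], bool]) -> List[str]:
    """Scan every node's unindexed data and collect only the survivors."""
    results: List[str] = []
    for node in nodes:
        results.extend(node.search(predicate))
    return results


# Illustrative data only.
nodes = [
    StorageNode("disk-a", ["red barn at dusk", "city skyline", "red kite"]),
    StorageNode("disk-b", ["forest trail", "red fox in snow"]),
]
matches = distributed_search(nodes, lambda r: "red" in r)
```

The design point is that the amount of data crossing the interconnect scales with the number of matches rather than with the size of the archive, which is what makes brute-force scanning of nonindexed data more plausible.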

October 18, 2005/by ams

Yahoo! Research Labs’ new head

Innovation

Yahoo’s new search master | Between the Lines | ZDNet.com
-Posted by Dan Farber @ 11:48 am

The arms race for scientists with expertise in various areas of search, data mining and data analysis is in full flower, as in the tug of war between Google and Microsoft over the services of Kai-Fu Lee. Meanwhile, Yahoo is making a major investment from its nearly $500 million annual engineering spend to build out its own world-class research group.
In fact, based on my conversation yesterday with Prabhakar Raghavan, the new head of Yahoo’s research group, Yahoo has its sights set on Nobel prizes and making breakthroughs to ensure the future of the company. I don’t think he was exaggerating. Search and creating more personalized user experiences that take advantage of underlying data and relationships is still in an infant phase. Yahoo, Google, Microsoft, Amazon and other major players understand that the spoils will go to those who provide answers, rather than links, and develop ways in which billions of consumers and creators of content can participate in an economic and social value chain.

Raghavan, who spent 14 years doing search and data mining-related research at IBM and was lured from his stint as CTO of enterprise search vendor Verity, told me that he intends “to go after the best in world and to get them.” He said that Yahoo will be able to attract top talent because of its stable and profitable business and the opportunity to impact Yahoo’s audience, who account for 12 to 15 percent of all the Web activity worldwide (Yahoo’s numbers). “We have an amazing outreach,” Raghavan said. “Ten terabytes of data, which for a scientist is pretty appealing.” Raghavan is also well connected in the research community: he is editor in chief of the prestigious Journal of the ACM.
For scientists with expertise in information retrieval, computational linguistics, machine learning, matrix and graph algorithms, unsupervised clustering, data mining and related areas, it’s like the U.S. housing market. They’ll have multiple bidders and command a premium. Raghavan said that Yahoo would stay away from high-profile hires and focus more on university researchers, as well as hiring college student interns and grooming them for jobs. Yahoo also recently formed a research center in association with the University of California, Berkeley. That said, he was able to recruit a well-regarded colleague, Andrew Tomkins from IBM.
Raghavan noted that his group will be active and open to the research community. “We will publish our research and interact with peers–it’s critical to the success of a research organization. There is an obvious aspect of marketing, PR and being visible contributors of ideas to the community. That said, we will not take every trade secret and publish it. It’s a challenge other industry leaders have solved before. We will publish and be judicious about how we do it.”
Raghavan has been in the job just over a month, but he has been impressed by what he called the “thirst for ideas that flow from research to the business.” He acknowledged that moving research into products is a challenge. He listed improving search, building a better advertising platform, making better sense of social media, large-scale distributed computing, and developing incentive structures and tools as his goals.
Regarding search, Raghavan said, “We have two views of better search. Most people are not interested in search; they want to get things done. The future has to be more friendly to people getting tasks done. You don’t want to spend two weeks of evenings sitting at a keyboard and piecing together a vacation plan. You want a system to go out and find the answers, based on future technology that goes beyond crawling and indexing pages.”
That future technology, according to Raghavan, is diving into the “deep Web” and semi-structured queries. “I hesitate to use the buzzword of ‘Semantic Web’–but it is about entity extraction, XML queries, unstructured queries, semantic ambiguity. We have to build a view of the world. When you issue a query, it has richer view than a text index. We’ll start to see manifestations of this in five years.”
On the back end, Raghavan wants to solve the problems like spam and to “align the commercial incentives of a billion content providers with social good intent.” He pointed to the field of mechanism design, a sub-field of microeconomics and game theory, as key to creating economic models that encourage people to participate in a clean, well-lighted digital marketplace with billions of content creators and consumers.
“We want to inspire the audience to give more data and more. If someone creates a snippet of music and others remix it and it finally becomes a hit, how do you divvy up the proceeds amongst all the constituents? That [economic incentive network] has to be figured out. There is a lot of microeconomics that is not fully understood, and it’s one of the areas we want to understand. There will be Nobel Prize in economics award for this stuff, and I wouldn’t be unhappy if it came from our group.”
Along those lines, Raghavan and Jon Kleinberg authored a paper recently entitled “Query Incentive Networks,” which looks at networks of interacting agents as economic systems, in which “users seeking information or services can pose queries, together with incentives for answering them, that are propagated along paths in the network.”
Yahoo wants to turn its fragmented set of services, content and marketplaces into a cohesive whole and to aggregate, distribute and monetize the creative output of its users. “We have a plethora of opportunities looking at different social networks, such as blogs, instant messaging, My Web, Yahoo 360, and other services, across Yahoo properties,” Raghavan said. Yahoo’s social search engine My Web 2.0, for example, allows Yahoo users to archive, tag and annotate search results and share them with other people using the service. Users can also search their contacts’ My Web and browse content that others on Yahoo’s network have shared.
But determining what data from the pools of Yahoo services and billions of inputs is useful to people and will create a breakthrough in the user experience is one of his team’s challenges. “It’s a classic problem in statistical machine learning: you might have 200 data points, but how do you zero in on the three that make a difference?”
As part of Yahoo’s Research initiative to harness the activity on its properties in ways that create new revenue streams and sticky user experiences, Raghavan’s team will be racing its competitors to come up with standards and methods for determining value, incentive systems, frictionless payments and rights management. “We will let the market determine what is interesting and those who contribute the interesting stuff will get rewarded,” Raghavan said.
However, without standards across user networks, every site will be a cul-de-sac. An incentive system on one site will not interoperate with another site. It’s like requiring users to have a different card for every kind of ATM machine. I asked Raghavan whether users should have access and control to the data collected by Yahoo. “Users should have control of what data is collected or given up and knowledge of what is done with it,” he said. “Giving every person their clickstream doesn’t make a lot of sense (most don’t want it), but they should have knowledge and control.”
However, Raghavan supported the concept of being able to exchange your data collection (such as your Amazon or Yahoo shopping clickstream and forms input) with another site. “The data belongs to the user because it’s about the user, but we are not at a point today where multiple shopping sites can exchange data. It’s a metadata challenge, but it’s more of a standards activity, not a research issue.”
In addition, his group is working on aspects of personalization. “Personalizing is a loaded word, and it sometimes gets trivialized. It’s not about customizing the colors on the MyYahoo page,” Raghavan said. “It’s more of a social phenomenon that takes into account what others are doing, especially people like yourself. Content, context and community coming together is a long-standing dream in our business; we are all going after it. But, the catch is when the user is not only a consumer but also creator of content. It leads to interesting possibilities in tandem with data mining and the user experience. You have to decide what content to show that users will find valuable, and not irritate users with too much content.”
Raghavan has also spent time looking at how to mine blogs for predicting the movement of products and developing new user experiences. “We are looking at sources of information– text, photos, podcasts–whatever we can mine from the back end. Then we look at what users want, and bring the two together to create an application from all chatter going on,” Raghavan said. “We can dream up cool experiences, but they have to be grounded in product reality. As we develop technology, markets start to react, so mining begets a reaction from market and begets more mining, so we are constantly working on more scenarios.”
Underpinning all of Yahoo’s–as well as every other megasite’s–dreams of growing to billions of active, transacting, content creating and consuming users is the ability to build an efficiency platform with millions of computers and data sets distributed around the globe. With 345 million unique users per month across 25 countries and in 13 languages, Yahoo, as well as its competitors–especially Google–has some experience in planetary scale computing.
While the progress over the last ten years of the Web has been significant, we are still in the Stone Age of search, social networks, incentive models and personalization. With the competitive juices flowing in research labs, and wide open commercial opportunities, the next ten years will be more about answers than links, but not without some serious flailing…

October 18, 2005/by ams

Siemens tries its hand at incubating startups

Innovation

…not all of which are spinoffs of its own; some are strategic investments in external inventions they believe they need. This is another great sign of the realignment of expectations for corporate venture activities…

Technology Start-Ups Get Chance
To Grow With Siemens Backing
By DANIEL ROSENBERG
DOW JONES NEWSWIRES
October 13, 2005; Page B4

Deep within Siemens AG, one of the world’s largest companies, lies a tiny technology incubator.

In Berkeley, Calif., eight small companies, each with the financial backing of Siemens, are in the start-up stage. Siemens backs the companies — most of which share space in the same building and some of which began with the proverbial “inventor-in-a-garage” status — through a program it started six years ago called Technology to Business.

The program provides companies with seed-stage financing of around $500,000 and helps with early commercialization. In return, Siemens gets a percentage of each company and access to new technologies that can aid the German engineering giant’s own businesses.

“TTB was created as a model to bring technology and innovation that are outside into Siemens,” said Stefan Heuser, president and chief executive of TTB since last year. “It’s an outside-in approach. We’re like an early-stage investor.”

The program looks for technologies that fit into Siemens’s businesses, but doesn’t prevent the small companies from eventually seeking outside venture financing and selling to other customers. The companies can move out on their own when they are ready. However, some entrepreneurs who hitch up with TTB do so knowing Siemens will make a solid customer for their products.

For instance, Amine Haoui, CEO of wireless-sensor company Sensys Networks, came aboard TTB two years ago, hoping Siemens would have a number of applications for his technology.

“I felt very strongly that, in our type of business, a partnership with a large corporation from the get-go would be very useful,” said Mr. Haoui, whose company makes traffic-monitoring systems used by government agencies. “With Siemens being the dominant traffic vendor in the world, it made a lot of sense to me. We’d know the market a lot better and get faster access to customers.”

Within two months of receiving financing from Siemens, Mr. Haoui and his team were getting a grand tour of Siemens’s industrial divisions in Germany and the U.S. “We got access to people it would have taken two years to [meet] on our own,” Mr. Haoui said.

Aleks Goellue, CEO of PINC Solutions, whose technology helps companies track products before they are shipped to customers, said it wasn’t a difficult choice to couple with Siemens’s TTB.

“Even though my previous start-up was funded with traditional venture capital from Day One, I preferred not to directly try to fund-raise” this time around, Mr. Goellue said. “Back in 1998, you could just present your idea. Now, VCs want a lot more traction,” he said, referring to venture-capital investors.

Another advantage is that at the TTB facilities, Mr. Goellue said he is able to exchange ideas with both Siemens’s employees and with people from other seed-stage firms whom he bumps into in the hallway.

In return for the seed-stage investment in Mr. Goellue’s company, Siemens obtained an ownership stake and also will have contracts under which PINC will build certain products for Siemens. Although PINC plans to seek additional financing from traditional venture-capital firms, “the relationship with Siemens will stay even when we graduate out of TTB,” Mr. Goellue said.

Could the relationship with Siemens become an impediment? Mr. Haoui doesn’t think so.

“We have an investment from Siemens and a partnership with them, but there are no strings attached,” he said. “There were instances where we talked with Siemens’s competitors in the market and I had to explain to them that we could work with them if we wanted to.”

Siemens’s Mr. Heuser compares the difference between classic corporate research-and-development efforts and TTB to that between farming and hunting. “You work on technologies for years and develop things, just like developing a crop and harvesting it again and again,” he said.

But with TTB, “The idea behind this was to go in a more hunting direction, looking for technologies outside of Siemens’s R&D, looking for ideas from the start-up community and universities,” Mr. Heuser said.

Siemens and a host of other companies have venture-capital arms, and 40 are corporate members of the National Venture Capital Association. But Siemens goes further than most with TTB and other programs, said David Spreng, managing partner with Crescendo Ventures and a board member of the NVCA.

October 18, 2005/by ams

Congrats on MonkeyDo!

Innovation

Mark Pilgrim has radically transformed MagicLine into MonkeyDo, a new kind of “browsing assistant” that guesses what kind of page you’re on and offers to post it to your personal del.icio.us bookmarks log with the appropriate tags.

I’m not sure he’d agree that there’s a line connecting the dots of MagicLine and MonkeyDo — you can see his attached description for clarification — but I see them both as ways to “evacuate” data detected while surfing to a semi-stable remote store, for eventual reuse such as filling out forms.

His package also includes a ridiculously useful — as in ridiculously 1) complicated and 2) annoyingly absent from the DOM — subroutine that ‘uninherits’ all cascaded style sheets (though Aaron Boodman followed up with an alternative CSS reset technique).

Mark Pilgrim announces MonkeyDo

Think of it as Clippy the Useless Office Assistant, only for the web, and actually useful. (I actually considered naming it Cl.ip.py, but thought better of it.) It sits in the background and watches as you browse, and if it recognizes a type of page that you consider interesting (as defined in Tools -> User Script Commands -> MonkeyDo options), it will offer to post it to del.icio.us. Or if you prefer, you can tell it to automatically post certain types of pages, and it will simply notify you when it has done so.

The heuristic for identifying different types of pages is, of course, somewhat messy, and will inevitably lead to embarrassingly hilarious mis-identification, which someone will no doubt bring to my attention.
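
To make the flow in the announcement concrete, here is a rough sketch of the decision it describes: guess a page type, then either ignore the page, offer to post it, or post it and notify. This is not Mark's code (MonkeyDo is a browser user script). The rule table, the page-type names and the `classify` and `decide` functions are all invented for illustration, and the real heuristics are certainly messier.

```python
import re
from typing import Optional, Set

# Hypothetical rules: (page type, pattern matched against the URL).
RULES = [
    ("photo", re.compile(r"flickr\.com/photos/")),
    ("weblog", re.compile(r"/(blog|archives)/")),
    ("feed", re.compile(r"\.(rss|atom)$")),
]


def classify(url: str) -> Optional[str]:
    """Return the first page type whose pattern matches, or None."""
    for page_type, pattern in RULES:
        if pattern.search(url):
            return page_type
    return None


def decide(url: str, interesting: Set[str], auto_post: Set[str]) -> str:
    """Map a URL to one of: 'ignore', 'offer', 'post-and-notify'."""
    page_type = classify(url)
    if page_type is None or page_type not in (interesting | auto_post):
        return "ignore"
    if page_type in auto_post:
        # The user asked for this type to be posted without prompting.
        return "post-and-notify"
    return "offer"
```

With `interesting={"photo"}` and `auto_post={"weblog"}`, a Flickr photo page yields `"offer"`, a URL containing `/blog/` yields `"post-and-notify"`, and anything unrecognized yields `"ignore"`.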

Mark’s announcement of MagicLine

It tracks your browsing and collects
– page URLs
– page titles
– referrers
– Author, description, keywords from meta tags
– Technorati tags from rel="tag" links
– XFN links
– autodiscovered RSS/Atom feeds
– autodiscovered FOAF files

Then you can press Control + Shift + L anywhere to get the MagicLine prompt. Start typing, and it autocompletes based on all the data it’s collected so far.
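
The collect-then-autocomplete behavior can be sketched in a few lines. This is only an illustration of prefix completion over collected strings, not MagicLine's implementation: the `CompletionIndex` class and the sample entries are hypothetical, and lowercasing every entry is a simplification made here so that matching is case-insensitive.

```python
from bisect import bisect_left
from typing import List


class CompletionIndex:
    """Sorted list of collected strings with prefix lookup."""

    def __init__(self) -> None:
        self._entries: List[str] = []  # kept sorted and lowercased

    def add(self, text: str) -> None:
        key = text.strip().lower()
        if not key:
            return
        pos = bisect_left(self._entries, key)
        # Insert only if the entry is not already present.
        if pos == len(self._entries) or self._entries[pos] != key:
            self._entries.insert(pos, key)

    def complete(self, prefix: str, limit: int = 5) -> List[str]:
        key = prefix.strip().lower()
        start = bisect_left(self._entries, key)
        out: List[str] = []
        # Entries sharing a prefix are contiguous in sorted order.
        for entry in self._entries[start:]:
            if not entry.startswith(key) or len(out) >= limit:
                break
            out.append(entry)
        return out


# Illustrative data standing in for collected titles, URLs and tags.
index = CompletionIndex()
for item in ["Dive Into Python", "dive into accessibility",
             "del.icio.us", "http://example.org/feed.atom"]:
    index.add(item)
suggestions = index.complete("dive")
```

Because the entries are kept sorted, all candidates for a prefix sit next to each other, so a lookup is one binary search followed by a short scan.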

August 17, 2005/by ams