We love Google’s new Desktop Search. We’ve been arguing about something like this for a year or more. The idea of searching everything you’ve seen — not just your hard drive, but everything hyperlinked to it (such as your surfing history) — is so intriguing we’ve built something similar for ourselves.

We’ve modified Nutch to search not just CommerceNet’s website, weblog, and wiki, but also everything we link to. Go ahead and try our index of CommerceNet’s Neighborhood. If you query for Nutch you won’t just see pages from our sites, but Nutch’s home page and even an application of it at CreativeCommons…
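To make "everything we link to" concrete, here is a minimal Python sketch of how such a neighborhood could be computed: our own pages plus every page one hop out. This is not our Nutch patch. The names `LinkCollector` and `neighborhood` are hypothetical, and the sketch assumes the pages' HTML has already been fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def neighborhood(pages):
    """Return our own URLs plus every URL they link to (one hop out).

    `pages` maps a page URL to its HTML source (hypothetical input shape).
    """
    urls = set(pages)
    for base_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(base_url, href))
            # Skip mailto:, javascript:, and other non-web schemes.
            if absolute.startswith(("http://", "https://")):
                urls.add(absolute)
    return sorted(urls)
```

The resulting list could then serve as the seed set for a crawler, with the crawl restricted to exactly those URLs.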

We’re also exploring a bunch of other ideas that GDS and other desktop indexing projects like StuffIveSeen haven’t tackled yet. Foremost is ranking: just like the AltaVista engine that Google itself dethroned, GDS doesn’t have anything like a PageRank for the gigabytes of information on your disk. Like many researchers, we suspect that a user’s social network is the key to discerning which hits are likely to be most useful. After all, if the Web is drowning in infoglut on any given query term, a user should be such an expert on the terms of his or her art that there ought to be even more hits to rank on localhost. One cure may be collaborative filtering with your friends…
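As a rough illustration of what collaborative filtering with your friends might look like, the Python sketch below boosts a local hit when friends with similar histories have also seen it. Everything here is an assumption made for illustration: the `jaccard` and `rerank` names, the idea that each friend shares a set of URLs they have seen, and the particular scoring formula. It is one hypothesis among many, not a ranking scheme we have validated.

```python
def jaccard(a, b):
    """Overlap between two sets of seen URLs, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rerank(hits, my_seen, friends_seen):
    """Re-order local hits using friends' histories.

    hits:         list of (url, text_score) pairs from the local index
    my_seen:      set of URLs I have seen
    friends_seen: dict mapping a friend's name to the set of URLs they have seen
    """
    # A friend counts for more the more their history overlaps with mine.
    weights = {name: jaccard(my_seen, seen) for name, seen in friends_seen.items()}

    def score(hit):
        url, text_score = hit
        social = sum(w for name, w in weights.items() if url in friends_seen[name])
        return text_score * (1.0 + social)

    return sorted(hits, key=score, reverse=True)
```

With this formula a hit that no friend has seen keeps its plain text score, so the social signal can only promote results, never bury them.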

Secondary aspects of the problem include the fact that many of us have multiple computers and identities on the Internet, so we’d need networks of personal search engines, and the possibility that a local-proxy-server approach might be better at capturing the “dynamics” of our interaction (how often we re-read the same email over IMAP, say).

But rather than rattling off a longer list of half-baked hypotheses, I’d like to cite GDS for at least one idea that never occurred to us: integrating it seamlessly with the public site. Sure, we thought AdWords-like ads were the key to a better revenue model for the Fisher category of PersonalWeb products.

No, what’s cool is that Google’s ordinary results pages from the public website automatically include hits from your hard drive. How’d they do that?! Read on…

CommerceNet Labs Wiki : FluffyBunnyBurrowsIntoWinSock

…we found that Google Desktop Server actually hooks into Windows’ TCP/IP stack to directly modify incoming traffic from Google’s websites to splice its local results in. Once you install GDS, there’s a bit of Google’s code running inside every Windows application that talks to the Internet.

It’s done using a long-established hook in WinSock2, its Layered Transport Service Provider Interface (SPI)…
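To illustrate just the splicing step, here is a small Python sketch that inserts a list of local hits into a results page before it reaches the browser. It is not GDS's code and not a Winsock LSP, which would be a native Windows DLL. The function name `splice_local_hits`, the `<!--results-->` marker, and the `local-hits` class are all hypothetical; we do not know what anchor GDS actually looks for in the page.

```python
from html import escape


def splice_local_hits(page_html, local_hits, marker="<!--results-->"):
    """Insert local hits just ahead of the first `marker` in the page.

    local_hits: list of (title, url) pairs from the desktop index.
    Returns the page unchanged if there are no hits or no marker.
    """
    if not local_hits or marker not in page_html:
        return page_html
    items = "".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in local_hits
    )
    block = f'<ul class="local-hits">{items}</ul>'
    head, tail = page_html.split(marker, 1)
    return head + block + marker + tail
```

Anything doing this on a live connection would also have to cope with details the sketch ignores, such as updating `Content-Length`, chunked transfer encoding, and compressed responses.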

The Winsock LSP is used mostly by spyware and censorware, so it’s a surprise to see a positive use for it. Spyware detectors like HijackThis consequently flag it.

[An aside: why is Rifkin’s GLAT posting more relevant on the query “rifkin fisher” than Rifkin’s actual Fisher posting? I think it’s Battelle’s fault, for increasing the GLAT’s PageRank! :-]