grack.com

Google Desktop Search: How it Works, Pt 1

I tried out Google Desktop Search today and decided to take a deeper look at how it works and how it integrates into your daily experience.  Everything here comes from reverse engineering and from observing file and registry activity.  None of it is guaranteed to be correct.

From looking at some of the PDB file references, I think the internal name of this Google search engine is "Total Recall".  This fits with the replacement string returned from Google ("<!--trs2-->") and the port number registry key "trs_port".

The search utility consists of three main applications and a number of "information provider" plugins.  The main applications are:

The plugins are:

The Winsock 1/2 interception is one of the cooler parts of Google Desktop Search.  Every network request you make runs through this filter.  Whenever a Google search is performed, the interception layer sends the request to the local indexing server and merges the local results with the web search results.  I verified this by running WinDump on the machine and comparing the request made to Google with the results that Firefox received.
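The merge step can be pictured as a simple string substitution.  This is only a minimal sketch, assuming the "<!--trs2-->" string in Google's response marks the spot where local results get spliced in.  I have not confirmed that this is how the interception layer does it, and the function name and sample HTML below are made up for illustration.

```python
# Hypothetical sketch: splice local results into a web results page.
# ASSUMPTION: the "<!--trs2-->" marker is the insertion point.

MARKER = "<!--trs2-->"


def merge_local_results(web_html: str, local_html: str, marker: str = MARKER) -> str:
    """Return web_html with the first marker replaced by local_html.

    If the marker is absent, the page is returned unchanged.
    """
    if marker not in web_html:
        return web_html
    # Replace only the first occurrence of the marker.
    return web_html.replace(marker, local_html, 1)


# Made-up example page and local hit, for illustration only.
page = "<html><body><!--trs2--><p>web result</p></body></html>"
merged = merge_local_results(page, "<p>local hit</p>")
```

A page without the marker passes through untouched, which would match non-search traffic being left alone by the filter.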

The BHO (Browser Helper Object) uses GoogleDesktopAPI2.dll to add pages to the indexing service.  To take screenshots, it uses the GetDC function to grab the current bitmap from the IE window itself.  You'll notice that if any other windows are obscuring the IE window at the time the screenshot is grabbed, they'll appear in your thumbnails.

GoogleDesktopAPI2.dll has a number of unnamed exports.  Each of the search plugins loads these functions by ordinal and calls into them.  So far, I haven't decoded any of them.
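For anyone who wants to poke at these functions, loading by ordinal looks roughly like the sketch below.  It uses Python's ctypes, which lets you index a loaded DLL with an integer ordinal on Windows.  The helper name is my own, and the ordinal and calling convention in the usage comment are placeholders, not values recovered from the DLL.

```python
import ctypes
import sys


def resolve_by_ordinal(dll_path: str, ordinal: int):
    """Load a DLL and return the function exported at the given ordinal.

    Only works on Windows; raises OSError elsewhere.
    """
    if sys.platform != "win32":
        raise OSError("ordinal lookup requires Windows")
    # ASSUMPTION: stdcall convention (WinDLL); the real convention is unknown.
    lib = ctypes.WinDLL(dll_path)
    # Indexing a ctypes DLL object with an int looks the export up by ordinal.
    return lib[ordinal]


# Usage on Windows (ordinal 1 is a placeholder, not a known entry point):
#   func = resolve_by_ordinal("GoogleDesktopAPI2.dll", 1)
```

Since the argument and return types are still unknown, calling whatever comes back is a good way to crash the process, so this is only useful as a starting point for a debugger session.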

More info as it comes!