I’m starting a new tag with this article—Technical—where I’ll cover technical subjects with little or no explanation of the jargon. These topics are not intended for the novice or the uninitiated. Everyone’s free to read and participate in the discussion, of course, but sometimes I feel like getting to the heart of the matter more than I feel like setting up the discussion.
Chris Ilias quoted Giganews support on why XPAT isn’t supported on Giganews servers:
The XPAT command attempts to search through our entire spool of over 700 million articles, to match on a specific keyword, that is often found only in a handful of newsgroups. The command puts enough of a load on our servers, that several people using this at one time can affect the performance that all of our customers receive.
As one of the posters on that blog pointed out, this is somewhat misleading. XPAT searches only one group; a GROUP command must precede an XPAT command. That greatly reduces the number of articles being searched. But what if it could be reduced further still?
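To make the GROUP-then-XPAT ordering concrete, here is a minimal sketch that only builds the command lines a client would send; it does not open a connection. The helper name `xpat_session` and the example group are my own inventions for illustration, and the XPAT argument order (header, range, pattern) follows my reading of RFC 2980.

```python
def xpat_session(group, header, pattern, first=1, last=None):
    """Return the NNTP command lines for an XPAT search within one group.

    Hypothetical helper: it only formats text and sends nothing.
    """
    # An open-ended range ("1-") means "from article 1 to the end of the group".
    article_range = f"{first}-" if last is None else f"{first}-{last}"
    return [
        f"GROUP {group}",                            # select the group first
        f"XPAT {header} {article_range} {pattern}",  # then search within it
    ]


if __name__ == "__main__":
    for line in xpat_session("comp.lang.python", "Subject", "*unicode*"):
        print(line)
```

The point is simply that the search is scoped by whatever group was selected last, not by the whole spool.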
I posted this some time ago on Ilias’ blog: Giganews could have one server generate an index that XPAT searches, either with fgrep or with some faster indexed search. Giganews could then offer XPAT while telling users that new articles will not appear in XPAT results immediately. I believe the indexer used in Beagle would help here; it’s free software (in the best sense: free to run, inspect, share, and modify).
This would let them decide how often the XPAT index is rebuilt and cap how much CPU time and storage it uses.
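A rough sketch of the idea, under loud assumptions: the class `XpatIndex`, the in-memory `spool` dictionary, and the use of Python's `fnmatch` as a stand-in for NNTP wildmat matching are all mine, not anything Giganews or Beagle actually does. It shows only the trade-off: searches read a per-group header index built at a time the operator chooses, so articles posted after the last rebuild stay invisible until the next one.

```python
import fnmatch
from collections import defaultdict


class XpatIndex:
    """Hypothetical per-group header index, rebuilt on the operator's schedule."""

    def __init__(self):
        # (group, lowercased header name) -> list of (article number, header value)
        self._headers = defaultdict(list)
        # group -> highest article number covered by the last rebuild
        self.indexed_through = {}

    def rebuild(self, spool):
        """Rebuild from spool, shaped {group: {article_number: {header: value}}}."""
        self._headers.clear()
        self.indexed_through.clear()
        for group, articles in spool.items():
            for number in sorted(articles):
                for header, value in articles[number].items():
                    self._headers[(group, header.lower())].append((number, value))
            self.indexed_through[group] = max(articles, default=0)

    def xpat(self, group, header, pattern):
        """Match pattern against one header in one group, using the index only."""
        rows = self._headers.get((group, header.lower()), [])
        # fnmatch is only an approximation of wildmat (assumption).
        return [(n, v) for n, v in rows if fnmatch.fnmatchcase(v, pattern)]


if __name__ == "__main__":
    spool = {"example.test": {1: {"Subject": "XPAT on Giganews"},
                              2: {"Subject": "Thunderbird search"}}}
    index = XpatIndex()
    index.rebuild(spool)
    spool["example.test"][3] = {"Subject": "XPAT again"}  # posted after the rebuild
    print(index.xpat("example.test", "Subject", "*XPAT*"))  # article 3 not yet visible
    index.rebuild(spool)
    print(index.xpat("example.test", "Subject", "*XPAT*"))  # article 3 now included
```

A real indexer would persist the index to disk and update it incrementally, but the lag between posting and searchability would look the same.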
Also, Thunderbird should be extended to do the same brute-force searching online that it will do offline.