> From: "Greg Linden" <http://www.findory.com/~glinden>
> Date: Wed, 15 Feb 2006 16:18:27 -0800
>
> Hi, Robert. Our crawl includes thousands of blogs and we add more every
> week. However, because of the overwhelming number of spam, junk, or
> otherwise inappropriate weblogs, we must manually review blogs before
> including them in our crawl. Blogs that are not in our crawl will not be
> ranked or change your personalization.
>
> I hope that you still enjoy Findory even if some of the weblogs in your
> Findory Favorites do not contribute to the personalization of Findory.

I think you misunderstand me. I wasn't suggesting that non-crawled feeds
be included for *global* personalization, but rather only for *individual*
personalization. That is, a Bayesian model could be maintained for each
individual user. The global personalization data could be used to train
this model, along with any non-crawled articles that the user clicked on
in Favorites. This could even solve the "cold start" problem when a new
feed is eventually added to your crawled set.

I tried sux0r briefly and it seems to use a naive Bayes recipe; it is
quite fast despite being written in PHP.

Anyway, just a suggestion.

> Thanks, Robert.
>
> - Greg
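
P.S. To make the per-user idea above a bit more concrete, here is a rough
sketch in Python of what I have in mind. It is only an illustration under my
own assumptions, not how Findory or sux0r actually work: the class name
`UserNaiveBayes`, the two labels "clicked" and "skipped", the whitespace
tokenizer, and the shape of the global seed data are all hypothetical. The
model starts from global counts and is then updated with one user's clicks,
including clicks on articles from non-crawled feeds.

```python
import math
from collections import Counter

LABELS = ("clicked", "skipped")  # hypothetical labels for this sketch


class UserNaiveBayes:
    """Illustrative per-user naive Bayes model seeded from global counts."""

    def __init__(self, global_counts=None, global_docs=None):
        global_counts = global_counts or {}
        global_docs = global_docs or {}
        # Per-label word counts, starting from the (assumed) global data.
        self.word_counts = {l: Counter(global_counts.get(l, {})) for l in LABELS}
        # Per-label document counts, used for the class prior.
        self.doc_counts = {l: global_docs.get(l, 0) for l in LABELS}

    def train(self, text, label):
        """Update this user's counts with one article."""
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def score(self, text):
        """Return log-odds of 'clicked' vs 'skipped'; above 0 favours a click."""
        vocab = set(self.word_counts["clicked"]) | set(self.word_counts["skipped"])
        vocab_size = len(vocab) or 1
        total_docs = sum(self.doc_counts.values())
        log_prob = {}
        for label in LABELS:
            # Add-one (Laplace) smoothing on both the prior and the word counts.
            lp = math.log((self.doc_counts[label] + 1) / (total_docs + len(LABELS)))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                lp += math.log((count + 1) / (total_words + vocab_size))
            log_prob[label] = lp
        return log_prob["clicked"] - log_prob["skipped"]


# Seed with made-up global data, then add one user-specific click.
model = UserNaiveBayes(
    global_counts={
        "clicked": {"python": 3, "search": 2},
        "skipped": {"celebrity": 4, "gossip": 1},
    },
    global_docs={"clicked": 2, "skipped": 2},
)
model.train("bayesian spam filter in php", "clicked")
print(model.score("bayesian filter"))
print(model.score("celebrity gossip"))
```

Because the per-user counts start from the global ones, a feed that later
joins the crawl would already have some signal from the users who clicked on
it in Favorites, which is the "cold start" benefit I mentioned.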