
Re: Findory misfeature: unknown blogs/RSS feeds



 > From: "Greg Linden" <http://www.findory.com/~glinden>
 > Date: Wed, 15 Feb 2006 16:18:27 -0800
 >
 > Hi, Robert.  Our crawl includes thousands of blogs and we add more every 
 > week.  However, because of the overwhelming number of spam, junk, or 
 > otherwise inappropriate weblogs, we must manually review blogs before 
 > including them in our crawl.  Blogs that are not in our crawl will not be 
 > ranked or change your personalization.
 > 
 > I hope that you still enjoy Findory even if some of the weblogs in your 
 > Findory Favorites do not contribute to the personalization of Findory.

I think you misunderstood me.  I wasn't suggesting that non-crawled feeds
be included in *global* personalization, only in *individual*
personalization.  That is, a separate Bayesian model could be maintained
for each user.  That model could be trained on the global personalization
data plus any non-crawled articles the user clicks on in Favorites.  This
could even solve the "cold start" problem when a new feed is eventually
added to your crawled set, since click data for it would already exist.
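
To make the idea concrete, here is a minimal sketch in Python of a per-user
naive Bayes model.  It is only an illustration under my own assumptions, not
how Findory or sux0r actually works: the UserModel class, the "click"/"skip"
labels, the 0.25 weight for global data, and the sample texts are all
hypothetical.

```python
import math
import re
from collections import Counter

LABELS = ("click", "skip")  # hypothetical labels: user read it, or did not


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class UserModel:
    """Hypothetical per-user naive Bayes model over article words."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = {label: 0.0 for label in LABELS}

    def train(self, text, label, weight=1.0):
        """Add one example; weight < 1 lets global data count for less."""
        self.doc_counts[label] += weight
        for word in tokenize(text):
            self.word_counts[label][word] += weight

    def score(self, text):
        """Return an estimate of P(click | text) with add-one smoothing."""
        vocab = set(self.word_counts["click"]) | set(self.word_counts["skip"])
        total_docs = sum(self.doc_counts.values())
        log_p = {}
        for label in LABELS:
            # Smoothed class prior.
            lp = math.log((self.doc_counts[label] + 1) / (total_docs + len(LABELS)))
            n = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word in vocab:  # ignore words never seen in training
                    lp += math.log((self.word_counts[label][word] + 1) / (n + len(vocab)))
            log_p[label] = lp
        # Normalize in log space to avoid underflow.
        top = max(log_p.values())
        odds = {label: math.exp(lp - top) for label, lp in log_p.items()}
        return odds["click"] / (odds["click"] + odds["skip"])


# Seed the model with (assumed) global personalization data at a low weight.
model = UserModel()
global_examples = [
    ("celebrity gossip fashion", "skip"),
    ("machine learning search ranking", "click"),
]
for text, label in global_examples:
    model.train(text, label, weight=0.25)

# A click in Favorites on an article from a non-crawled feed.
model.train("naive bayes spam filter php", "click")

print(model.score("bayes filter for rss feeds"))
print(model.score("celebrity fashion gossip"))
```

Because the clicks on non-crawled articles are kept in the user's own model,
that history would already be there if the feed were later added to the
crawl.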

I tried sux0r briefly, and it seems to use a naive Bayes recipe; it is
quite fast despite being written in PHP.

Anyway, just a suggestion.

 > Thanks, Robert.
 > 
 >   - Greg



