Added: Fri, Jun 18th 2004 17:48 UTC (4 years, 5 months ago)
Updated: Sun, Oct 5th 2008 08:07 UTC (2 months, 0 days ago)
About:
YaCy is a personal Web crawler and Web search engine. It is also a P2P-based Web index exchange network with no central server and no possibility of censorship. Web crawls can be done locally, or you can trigger a collaborative Web crawl with all other YaCy peers. YaCy is fun to use and shows interesting text, image, audio, and video search results with direct links to Ogg, MP3, and video files. It has a cooperative bookmark system and many Web publishing functions.
Author:
Michael Christen [contact developer]
Homepage:
http://yacy.net/
Tar/GZ:
http://yacy.net/release/yacy_v0.61_20081003_5246.tar.gz
Changelog:
http://yacy.net/Development.html
CVS tree (cvsweb):
http://svn.berlios.de/wsvn/yacy
Mailing list archive:
https://lists.berlios.de/pipermail/yacy-svn/
Demo site:
http://yacyweb.de/
Trove categories:
Development Status: 4 - Beta
Environment: Console (Text Based), MacOS X, No Input/Output (Daemon), Web Environment, Win32 (MS Windows)
Intended Audience: Advanced End Users, End Users/Desktop, System Administrators
License: OSI Approved :: GNU General Public License (GPL)
Operating System: OS Independent
Programming Language: Java
Topic: Communications :: File Sharing, Information Management, Internet :: Name Service (DNS), Internet :: Proxy Servers, Internet :: WWW/HTTP, Internet :: WWW/HTTP :: Dynamic Content, Internet :: WWW/HTTP :: HTTP Servers, Internet :: WWW/HTTP :: Indexing/Search, System :: Networking
Dependencies:
No dependencies filed
» Rating:
8.20/10.00
(Rank N/A)
» Vitality: 0.15% (Rank 808)
» Popularity: 2.45% (Rank 2033)

Record hits: 33,967
URL hits: 9,930
Subscribers: 51
Comments
YAcY is a badly behaved robot
by Paul Gregg - Feb 27th 2006 17:43:51
1. YAcY doesn't ask for robots.txt, let alone follow it.
2. YAcY posts the yacy web address as the HTTP Refer[r]er header, similar
to spam bots. Well-behaved bots may put their URL into the User-Agent
header.
I only came across this project whilst researching HTTP Referrer
spammers. Nice idea, shame about the implementation.
Re: YAcY is a badly behaved robot
by Low012 - Mar 6th 2006 15:42:05
> 1. YAcY doesn't ask for robots.txt, let
> alone follow it.
> 2. YAcY posts the yacy web address as
> the HTTP Refer[r]er header similar to
> spam bots.
These issues have been resolved for some time now.
Re: YAcY is a badly behaved robot
by Michael Christen - Mar 8th 2006 07:08:50
Neither claim is true:
1) YaCy has respected robots.txt since mid-2005. It never ignored
robots.txt on purpose; support simply had not been implemented before
then.
2) There is no referrer spam. YaCy shows that the page was indexed by a
YaCy peer. Since the corresponding web page is then referenced not only by
this peer but by all peers, there must be a central address where the
owner of a referred page can see that it was referenced by a
non-centralized web crawler. This is a unique problem that centralized
crawlers do not have. In this case YaCy is just being honest and refers to
the YaCy project page. This feature was removed in YaCy 0.43 because too
many people were confused by this referrer.