arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Completely Open Source @ GitHub

Regarding term/phrase extraction

Answered (Verified) This post has 1 verified answer | 2 Replies | 1 Follower

Top 10 Contributor
83 Posts
InvestisDev posted on Tue, May 28 2013 1:32 AM

Hello

We need term/phrase extraction functionality, for which we tried using Lucene.Net v3.0.3 with our current arachnode.net v2.6.0 solution. Lucene.Net v3.0.3 includes ShingleFilter functionality, which can be used for keyword/phrase extraction.
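
For readers unfamiliar with shingles: a shingle is a run of consecutive words (a word n-gram), and counting repeated shingles is one simple way to surface candidate phrases. The sketch below illustrates that idea in plain C# without Lucene.Net. It is not the ShingleFilter API or arachnode.net code; `ShingleSketch`, `Shingles` and `CountShingles` are hypothetical names, and the whitespace tokenization is an assumption made to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShingleSketch
{
    // Hypothetical helper (not part of Lucene.Net or arachnode.net):
    // returns every run of minSize..maxSize consecutive words as one phrase.
    public static List<string> Shingles(string text, int minSize, int maxSize)
    {
        if (minSize < 1 || maxSize < minSize)
            throw new ArgumentException("Require 1 <= minSize <= maxSize.");

        // Naive whitespace tokenization (assumption); a real analyzer would
        // also deal with punctuation, casing and stop words.
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        var result = new List<string>();

        for (int start = 0; start < words.Length; start++)
        {
            for (int size = minSize; size <= maxSize && start + size <= words.Length; size++)
            {
                // Join 'size' words beginning at index 'start'.
                result.Add(string.Join(" ", words, start, size));
            }
        }
        return result;
    }

    // Counts how often each lower-cased shingle occurs, so repeated phrases stand out.
    public static Dictionary<string, int> CountShingles(string text, int minSize, int maxSize)
    {
        return Shingles(text.ToLowerInvariant(), minSize, maxSize)
            .GroupBy(s => s)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For example, `ShingleSketch.Shingles("open source web crawler", 2, 3)` yields "open source", "open source web", "source web", "source web crawler" and "web crawler". Ranking the counts from `CountShingles` over crawled page text would give a rough list of candidate phrases.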

My question is this: we had to change the arachnode.net code wherever Lucene.Net is referenced to make it compatible, and index writing and reading now work fine. Do you foresee any further challenges when using Lucene.Net 3.0.3 with arachnode.net v2.6.0?

If so, can you suggest any phrase extraction functionality we can use together with arachnode.net?

Awaiting your reply

Thanks

All Replies

Top 10 Contributor
1,905 Posts

Should be fine.  Let me know what you discover...

Look at UserDefinedFunctions.ExtractPhrases(...);

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet

Top 10 Contributor
83 Posts
Verified by arachnode.net

Too late to reply, but anyway, it worked great.

copyright 2004-2017, arachnode.net LLC