arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

About index file creation

InvestisDev posted on Thu, Sep 13 2012 3:19 AM

Hello Mike,

We crawled one site, which created index files with names like

segments_2, _0.fdt, _0.fdx, etc.

After crawling again, new files were created with names like

segments_3, _1.fdt, _1.fdx, etc.

If both sets of files are in the same folder, which of them will be used when loading the search results?

Let me know if you need any further details.

 

Thanks,

Verified Answer

Verified by InvestisDev

It should be all of them.

When you stop the second crawl, the files from the first crawl should be merged. Additional files may be created when documents are deleted, so that you have the option of committing your changes (or not).
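As a rough illustration of why both sets of files can count: a Lucene index reader opens the newest commit file (the `segments_N` file with the highest generation, where `N` is a base-36 number), and that file lists the segments (`_0`, `_1`, ...) that make up the index. The sketch below is plain Python, not the Lucene.NET API, and only looks at file names. The helper names `latest_commit_file` and `segment_files` are made up for this example, and the sketch does not parse the commit file, so it cannot confirm which segments `segments_3` actually references.

```python
import re
from typing import Dict, Iterable, List, Optional

# Commit files are named segments_<generation>, with the generation in base 36.
_COMMIT_RE = re.compile(r"^segments_([0-9a-z]+)$")
# Per-segment files are named _<segment>.<extension>, e.g. _0.fdt or _1.fdx.
_SEGMENT_FILE_RE = re.compile(r"^(_[0-9a-z]+)\.[a-z0-9]+$")


def latest_commit_file(names: Iterable[str]) -> Optional[str]:
    """Return the segments_N name with the highest generation, or None."""
    best_name, best_gen = None, -1
    for name in names:
        match = _COMMIT_RE.match(name)
        if match:
            generation = int(match.group(1), 36)
            if generation > best_gen:
                best_name, best_gen = name, generation
    return best_name


def segment_files(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group per-segment files by their segment prefix (_0, _1, ...)."""
    groups: Dict[str, List[str]] = {}
    for name in sorted(names):
        match = _SEGMENT_FILE_RE.match(name)
        if match:
            groups.setdefault(match.group(1), []).append(name)
    return groups


# File names taken from the question above.
folder = ["segments_2", "_0.fdt", "_0.fdx", "segments_3", "_1.fdt", "_1.fdx"]
print(latest_commit_file(folder))  # segments_3
print(segment_files(folder))
```

With the file names from the question, the newest commit file is `segments_3`, and the folder holds files for two segments, `_0` and `_1`. Whether both are searched depends on what `segments_3` lists, which matches the "all of them" answer if the second crawl wrote into the same index.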

Mike

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet

copyright 2004-2017, arachnode.net LLC