Lucene CorruptIndexException (checksum failed) on GridLuceneIndex - suggested patch

Manu
Hi,

GridLuceneOutputStream has a bug in its copyBytes method, and
GridLuceneInputStream has one in its readBytes method when it is called
directly from GridLuceneOutputStream. Both have been present since the
version in which Ignite was updated to Lucene 5.5.2:

commit 478d3b5d3361c3d74d0da4b6a78e9944d8b95630
IGNITE-3562: Updated Lucene dependency to version 5.5.2. This closes #1987.

In both methods the internal CRC of GridLuceneOutputStream is not updated, so
we get /org.apache.lucene.index.CorruptIndexException: checksum failed
(hardware problem?) [...]/ when Lucene internally tries to merge the index.
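
To illustrate the failure mode, here is a minimal sketch. It is not the
Ignite code and not the attached patch: ChecksummedOutput, its
updateCrcOnCopy flag and the helper methods are hypothetical names made up
for this example, and only java.util.zip.CRC32 from the JDK is used. It
assumes an output that keeps a running CRC on the normal write path and also
has a bulk-copy path; if the bulk-copy path stores bytes without feeding them
to the CRC, the tracked checksum no longer matches the stored data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical, simplified illustration; not the Ignite or Lucene classes. */
class CrcCopySketch {
    /** In-memory output that tracks a CRC32 over the bytes it stores. */
    static class ChecksummedOutput {
        private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
        private final CRC32 crc = new CRC32();
        private final boolean updateCrcOnCopy;

        ChecksummedOutput(boolean updateCrcOnCopy) {
            this.updateCrcOnCopy = updateCrcOnCopy;
        }

        /** Normal write path: stored data and checksum stay in sync. */
        void writeBytes(byte[] b, int off, int len) {
            crc.update(b, off, len);
            buf.write(b, off, len);
        }

        /** Bulk-copy path that bypasses writeBytes. */
        void copyBytes(byte[] src, int off, int len) {
            if (updateCrcOnCopy)
                crc.update(src, off, len); // the kind of update reported missing
            buf.write(src, off, len);
        }

        long getChecksum() {
            return crc.getValue();
        }

        byte[] toByteArray() {
            return buf.toByteArray();
        }
    }

    /** Recomputes a CRC32 over the stored bytes, as a verifying reader would. */
    static long recompute(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    /** Returns true when the tracked checksum matches the stored bytes. */
    static boolean checksumMatches(boolean updateCrcOnCopy) {
        ChecksummedOutput out = new ChecksummedOutput(updateCrcOnCopy);
        byte[] header = {1, 2, 3, 4};
        byte[] payload = "merged segment data".getBytes(StandardCharsets.UTF_8);

        out.writeBytes(header, 0, header.length);
        out.copyBytes(payload, 0, payload.length);

        return out.getChecksum() == recompute(out.toByteArray());
    }

    public static void main(String[] args) {
        System.out.println("CRC updated in copyBytes: " + checksumMatches(true));
        System.out.println("CRC skipped in copyBytes: " + checksumMatches(false));
    }
}
```

In this toy model the mismatch only shows up once something goes through the
bulk-copy path, which would be consistent with the exception appearing at
merge time rather than on every write.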

To reproduce:
1 - Create a cache with a query entity that has at least one @QueryTextField
String field.
2 - Insert data into the cache in a loop, via put or by streaming (for
example, random 50 KB strings; bigger strings make it fail sooner).
3 - Wait (no more than 1 minute, depending on your computer) until Lucene
tries to merge the index internally.

Suggested patch to fix CorruptIndexException on GridLuceneIndex
FIX-IGNITE-LUCENE-STREAM-CRC.patch
<http://apache-ignite-developers.2346864.n4.nabble.com/file/t242/FIX-IGNITE-LUCENE-STREAM-CRC.patch>  

Hope it helps!!

Bye!

Manu



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: Lucene CorruptIndexException (checksum failed) on GridLuceneIndex - suggested patch

Vladimir Ozerov
Hi Andrey,

I know you helped migrate Apache Ignite to a newer Lucene version. Could you
please take a look at the patch?

Vladimir.

On Fri, Apr 6, 2018 at 12:02 PM, Manu <[hidden email]> wrote:

> [...]

Re: Lucene CorruptIndexException (checksum failed) on GridLuceneIndex - suggested patch

Andrey Mashenkov
Hi Vladimir,

Patch looks good.
I'll try to make a test for the case described and prepare a PR.

On Mon, Apr 9, 2018 at 11:58 AM, Vladimir Ozerov <[hidden email]>
wrote:

> [...]

Re: Lucene CorruptIndexException (checksum failed) on GridLuceneIndex - suggested patch

Andrew Mashenkov
Guys,

I've created a ticket for this [1].
It looks like the CRC update was indeed missed, but I can't reproduce the
issue. I've tried different cases with no luck: putting/updating various
large strings from single/multiple threads.


[1] https://issues.apache.org/jira/browse/IGNITE-8175

On Mon, Apr 9, 2018 at 12:15 PM, Andrey Mashenkov <[hidden email]>
wrote:

> [...]



--
Best regards,
Andrey V. Mashenkov

Re: Lucene CorruptIndexException (checksum failed) on GridLuceneIndex - suggested patch

Andrew Mashenkov
In reply to this post by Manu
Hi Manu,

We've fixed the CRC issue, but we still can't build a reproducer.
It seems we've missed something.

Can you share a repro?


