Comments on Changing Bits: Lucene's indexing is fast!
Michael McCandless (http://www.blogger.com/profile/04277432937861334672)
Updated: 2023-09-01T03:38:08.236-04:00

simonw, 2011-04-01T10:13:06.726-04:00:
If you are interested in how LUCENE-2324 turned out in its first benchmark, I blogged about it today here: http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/

Michael McCandless, 2010-09-17T10:35:07.564-04:00:
We should try that!

With flexible indexing it should be simple to make a codec that switches the underlying encoding (similar to how Pulsing codec works), e.g. based on doc count in the segment.

I also want to make a codec that decides term by term which encoding to use -- pulsing for very low freq terms, maybe standard for medium freq, and FOR/PFOR for high freq.

Andrzej, 2010-09-17T08:14:32.799-04:00:
In addition to the flush issue, I wonder if we shouldn't use a less expensive encoding for segments smaller than X docs (or N bytes), maybe even write much of the data uncompressed?
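
The per-term idea in Michael McCandless's comment (pulsing for very low frequency terms, standard for medium, FOR/PFOR for high) might be sketched roughly as below. This is only an illustration of the selection step, not Lucene code: the names `Encoding`, `choose_encoding`, `LOW_FREQ_MAX`, and `HIGH_FREQ_MIN` are hypothetical, and the cutoff values are placeholders, since the comments give no thresholds.

```python
from enum import Enum


class Encoding(Enum):
    """Postings encodings named in the comment thread."""
    PULSING = "pulsing"
    STANDARD = "standard"
    FOR_PFOR = "for/pfor"


# Hypothetical cutoffs, chosen only for illustration; the thread does not
# specify any values, and real ones would need benchmarking.
LOW_FREQ_MAX = 1      # terms in at most this many docs count as "very low freq"
HIGH_FREQ_MIN = 128   # terms in at least this many docs count as "high freq"


def choose_encoding(doc_freq: int) -> Encoding:
    """Pick an encoding for one term from its document frequency."""
    if doc_freq < 1:
        raise ValueError("doc_freq must be at least 1")
    if doc_freq <= LOW_FREQ_MAX:
        return Encoding.PULSING
    if doc_freq < HIGH_FREQ_MIN:
        return Encoding.STANDARD
    return Encoding.FOR_PFOR
```

The same shape would presumably fit the per-segment variant from the thread, with the segment's doc count (or byte size) as the input instead of a term's document frequency.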