Monday, May 14, 2012

Finite State Automata in Lucene

Lucene Revolution 2012 is now done, and the talk Robert and I gave went well! We showed how we are using automata (FSAs and FSTs) to make great improvements throughout Lucene.

You can view the slides here.

This was the first time I used Google Docs exclusively for a talk, and I was impressed! The real-time collaboration was awesome: we each could see the edits the other was doing, live. You never have to "save" your document: instead, every time you make a change, the document is saved to a new revision and you can then use infinite undo, or step back through all revisions, to go back.

Finally, Google Docs covers the whole life-cycle of your talk: editing/iterating, presenting (it presents in full-screen just fine, but does require an internet connection; I exported to PDF ahead of time as a backup) and, finally, sharing with the rest of the world!


  1. This comment has been removed by the author.

  2. Hey Mike, I'm learning a ton from your blog. I thought I'd let you know I posted a Lucene FSA Tutorial on our company's blog that your readers might be interested in.

  3. Hi Doug, that's a nice tutorial! Thanks for sharing.

  4. Hi Mike, I know it is old article, but I have this question. When you say FST, do you mean Block trees(burst tree DS) terms dictionary in lucene code?

    1. The block tree terms dictionary is just one of many places where we use FSTs in Lucene now. Try checkout out the source code and then grep'ing for org.apache.lucene.util.fst ...