Lucene's 3.4.0 release adds a new feature called index-time join (also
sometimes called sub-documents, nested documents or parent/child
documents), enabling efficient indexing and searching of certain types
of relational content.
Most search engines can't directly index relational content,
as documents in the index logically behave like a single flat database
table.  Yet, relational content is everywhere!  A job listing site has
each company joined to the specific listings for that company.  Each
resume might have separate lists of skills, education and past work
experience.  A music search engine has an artist/band joined to albums
and then joined to songs.  A source code search engine would have
projects joined to modules and then files.
Perhaps the PDF documents you need to search are immense, so you break
them up and index each section as a separate Lucene document; in this
case you'll have common fields (title, abstract, author, date
published, etc.) for the overall document, joined to the sub-document
(section) with its own fields (text, page number, etc.).  XML
documents typically contain nested tags, representing joined
sub-documents; emails have attachments; office documents can embed
other documents.  Nearly all search domains have some form of
relational content, often requiring more than one join.
If such content is so common then how do search applications handle it
today?
One obvious "solution" is to simply use a relational database instead
of a search engine!  If relevance scores are less important and you
need to do substantial joining, grouping, sorting, etc., then using a
database could be best overall.  Most databases include some form of
text search, some even using Lucene.
If you still want to use a search engine, then one common approach is
to denormalize the content up front, at index-time, by joining all
tables and indexing the resulting rows, duplicating content in the
process.  For
example, you'd index each song as a Lucene document, copying over all
fields from the song's joined album and artist/band.  This works
correctly, but can be horribly wasteful as you are indexing identical
fields, possibly including large text fields, over and over.
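
To make the duplication concrete, here is a minimal sketch in plain Java. It makes no Lucene calls: ordinary maps stand in for documents, and the class, method and field names (DenormalizeSketch, denormalize, artistBio and so on) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain Java maps stand in for Lucene documents.
class DenormalizeSketch {

    // Join artist -> album -> songs into one flat row per song,
    // copying every artist and album field onto each song row.
    static List<Map<String, String>> denormalize(
            Map<String, String> artist,
            Map<String, String> album,
            List<Map<String, String>> songs) {
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        for (Map<String, String> song : songs) {
            Map<String, String> row = new LinkedHashMap<String, String>();
            row.putAll(artist);  // duplicated on every row
            row.putAll(album);   // duplicated on every row
            row.putAll(song);    // unique to this row
            rows.add(row);
        }
        return rows;
    }

    // Hypothetical helper: build a document from key/value pairs.
    static Map<String, String> doc(String... kv) {
        Map<String, String> m = new LinkedHashMap<String, String>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }
}
```

Every row carries its own copy of the artist and album fields, so a large text field on the artist is repeated once per song.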
Another approach is to do the join yourself, outside of
Lucene, by indexing songs, albums and artist/band as separate Lucene
documents, perhaps even in separate indices.  At search-time, you
first run a query against one collection, for example the songs.  Then
you iterate through all hits, gathering up (joining) the full
set of corresponding albums and then run a second query against the
albums, with a large OR'd list of the albums from the first query,
repeating this process if you need to join to artist/band as well.
This approach will also work, but doesn't scale well, as the
follow-on queries can become immense.
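
The two passes can be sketched in plain Java, again with no Lucene calls. The Song class, the albumId field and the query string format are assumptions for illustration only; a real application would run each pass as an actual search.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: an in-memory stand-in for the do-it-yourself join.
class TwoPassJoinSketch {

    // A song "document": a title plus the ID of the album it belongs to.
    static class Song {
        final String title;
        final String albumId;

        Song(String title, String albumId) {
            this.title = title;
            this.albumId = albumId;
        }
    }

    // First pass: "query" the songs and gather the IDs of their albums.
    static Set<String> albumIdsForSongsMatching(List<Song> songs, String word) {
        Set<String> ids = new LinkedHashSet<String>();
        for (Song s : songs) {
            if (s.title.contains(word)) {
                ids.add(s.albumId);
            }
        }
        return ids;
    }

    // Second pass: build the follow-on query as one big OR over album IDs.
    static String followOnQuery(Set<String> albumIds) {
        StringBuilder sb = new StringBuilder();
        for (String id : albumIds) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            sb.append("albumId:").append(id);
        }
        return sb.toString();
    }
}
```

The follow-on query gets one clause per distinct album from the first pass, which is why it grows with the number of song hits.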
Yet another approach is to use a software package that has already
implemented one of these approaches for you!  elasticsearch, Apache
Solr, Apache Jackrabbit, Hibernate Search and many others all handle
relational content in some way.
With BlockJoinQuery you can now directly search relational content
yourself!
Let's work through a simple example: imagine you sell shirts online.
Each shirt has certain common fields such as name, description,
fabric, price, etc.  For each shirt you have a number of separate
stock keeping units or SKUs, which have their own fields like size,
color, inventory count, etc.  The SKUs are what you actually sell, and
what you must stock, because when someone buys a shirt they buy a
specific SKU (size and color).
Maybe you are lucky enough to sell the incredible Mountain Three-wolf
Moon Short Sleeve Tee, with these SKUs (size, color):
  -  small, blue
  -  small, black
  -  medium, black
  -  large, gray
Perhaps a user first searches for "wolf shirt", gets a bunch of hits,
and then drills down on a particular size and color, resulting in this
query:
   name:wolf AND size=small AND color=blue
which should match this shirt.  name is a shirt field while size and
color are SKU fields.
But if the user drills down instead on a small gray shirt:
   name:wolf AND size=small AND color=gray
then this shirt should not match because the small size only comes in
blue and black.
How can you run these queries using BlockJoinQuery?  Start by indexing
each shirt (parent) and all of its SKUs (children) as separate
documents, using the new IndexWriter.addDocuments API to add one shirt
and all of its SKUs as a single document block.  This method
atomically adds a block of documents into a single segment as adjacent
document IDs, which BlockJoinQuery relies on.  You should also add a
marker field to each shirt document (e.g. type = shirt), as
BlockJoinQuery requires a Filter identifying the parent documents.
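
To see why adjacent document IDs matter, here is a small model of a document block in plain Java. It is not Lucene code: a list position plays the role of a docID, java.util.BitSet stands in for the parent filter's bit set, and all names (BlockJoinSketch, addBlock, join, doc) are hypothetical. It follows the layout Lucene's block join expects, with the children first and the parent as the last document of the block.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Illustrative only: a toy index where list position is the docID.
class BlockJoinSketch {
    final List<Map<String, String>> docs = new ArrayList<Map<String, String>>();
    // One bit per docID, set only for parent (shirt) documents.
    final BitSet parents = new BitSet();

    // Append one block: all children first, then the parent, so the
    // block occupies adjacent docIDs and ends with the parent.
    void addBlock(Map<String, String> parent, List<Map<String, String>> children) {
        docs.addAll(children);
        parents.set(docs.size());
        docs.add(parent);
    }

    // Return the docIDs of parents having at least one child document
    // that contains every field/value pair in childQuery.
    TreeSet<Integer> join(Map<String, String> childQuery) {
        TreeSet<Integer> hits = new TreeSet<Integer>();
        for (int doc = 0; doc < docs.size(); doc++) {
            if (parents.get(doc)) {
                continue;  // only children are tested against the child query
            }
            if (docs.get(doc).entrySet().containsAll(childQuery.entrySet())) {
                // The parent is the next set bit at or after the child.
                hits.add(parents.nextSetBit(doc));
            }
        }
        return hits;
    }

    // Hypothetical helper: build a document from key/value pairs.
    static Map<String, String> doc(String... kv) {
        Map<String, String> m = new HashMap<String, String>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }
}
```

Because each child is tested on its own, small and gray only produce a hit when a single SKU has both values; mapping child to parent is then just a scan forward to the next parent bit.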
To run a BlockJoinQuery at search-time, you'll first need to create
the parent filter, matching only shirts.  Note that the filter must
use FixedBitSet under the hood, like CachingWrapperFilter:
  Filter shirts = new CachingWrapperFilter(
                    new QueryWrapperFilter(
                      new TermQuery(
                        new Term("type", "shirt"))));
Create this filter once, up front and re-use it any time you need to
perform this join.
Then, for each query that requires a join, because it involves
both SKU and shirt fields, start with the child query matching only
SKU fields:
  BooleanQuery skuQuery = new BooleanQuery();
  skuQuery.add(new TermQuery(new Term("size", "small")), Occur.MUST);
  skuQuery.add(new TermQuery(new Term("color", "blue")), Occur.MUST);
Next, use BlockJoinQuery to translate hits from the SKU document space
up to the shirt document space:
  BlockJoinQuery skuJoinQuery = new BlockJoinQuery(
    skuQuery, 
    shirts,
    ScoreMode.None);
The ScoreMode enum decides how scores for multiple SKU hits should be
aggregated to the score for the corresponding shirt hit.  In this
query you don't need scores from the SKU matches, but if you did you
could aggregate with Avg, Max or Total instead.
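
As a rough illustration of what those modes mean, here is a plain Java sketch of the aggregation. The Mode enum and parentScore method are hypothetical stand-ins, not Lucene's own code, and returning 0 for the None case is an assumption of this sketch.

```java
// Illustrative only: how child (SKU) scores might roll up to one parent score.
class ScoreModeSketch {

    // Hypothetical stand-in for the ScoreMode choices named above.
    enum Mode { NONE, AVG, MAX, TOTAL }

    // childScores holds the scores of the SKUs that matched under one
    // shirt; it is assumed to contain at least one score.
    static double parentScore(double[] childScores, Mode mode) {
        double total = 0.0;
        double max = Double.NEGATIVE_INFINITY;
        for (double s : childScores) {
            total += s;
            max = Math.max(max, s);
        }
        switch (mode) {
            case AVG:
                return total / childScores.length;
            case MAX:
                return max;
            case TOTAL:
                return total;
            default:
                return 0.0;  // NONE: assumption, child scores are ignored
        }
    }
}
```

With three matching SKUs scoring 1.0, 2.0 and 3.0, the shirt would get 2.0 under AVG, 3.0 under MAX and 6.0 under TOTAL.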
You are now free to build up an arbitrary shirt query using
skuJoinQuery as a clause:
  BooleanQuery query = new BooleanQuery();
  query.add(new TermQuery(new Term("name", "wolf")), Occur.MUST);
  query.add(skuJoinQuery, Occur.MUST);
You could also just run skuJoinQuery as-is if the query doesn't have
any shirt fields.
Finally, just run this query like normal!  The returned hits will be
only shirt documents; if you'd also like to see which SKUs matched for
each shirt, use BlockJoinCollector:
  BlockJoinCollector c = new BlockJoinCollector(
    Sort.RELEVANCE, // sort
    10,             // numHits
    true,           // trackScores
    false           // trackMaxScore
    );
  searcher.search(query, c);
The provided Sort must use only shirt fields (you cannot sort by any
SKU fields).  When each hit (a shirt) is competitive, this collector
will also record all SKUs that matched for that shirt, which you can
retrieve like this:
  TopGroups hits = c.getTopGroups(
    skuJoinQuery,
    skuSort,
    0,   // offset
    10,  // maxDocsPerGroup
    0,   // withinGroupOffset
    true // fillSortFields
  );
Set skuSort to the sort order for the SKUs within each shirt.  The
first offset hits are skipped (use this for paging through shirt
hits).  Under each shirt, at most maxDocsPerGroup SKUs will be
returned.  Use withinGroupOffset if you want to page within the SKUs.
If fillSortFields is true then each SKU hit will have values for the
fields from skuSort.
The hits returned by BlockJoinCollector.getTopGroups are SKU hits,
grouped by shirt.  You'd get the exact same results if you had
denormalized up-front and then used grouping to group results by
shirt.
You can also do more than one join in a single query; the joins can be
nested (parent to child to grandchild) or parallel (parent to child1
and parent to child2).
However, there are some important limitations of index-time joins:
  -  The join must be computed at index-time and "compiled" into the
    index, in that all joined child documents must be indexed along
    with the parent document, as a single document block.
  -  Different document types (for example, shirts and SKUs) must
    share a single index, which is wasteful as it means non-sparse
    data structures like FieldCache entries consume more
    memory than they would if you had separate indices.
  -  If you need to re-index a parent document or any of its child
    documents, or delete or add a child, then the entire block must be
    re-indexed.  This is a big problem in some cases, for example if
    you index "user reviews" as child documents then whenever a user
    adds a review you'll have to re-index that shirt as well as all
    its SKUs and user reviews.
  -  There is no QueryParser support, so you need to
    programmatically create the parent and child queries,
    separating according to parent and child fields.
  -  The join can currently only go in one direction (mapping child
    docIDs to parent docIDs), but in some cases you need to map parent
    docIDs to child docIDs.  For example, when searching songs,
    perhaps you want all matching songs sorted by their title.  You
    can't easily do this today because the only way to get song hits
    is to group by album or band/artist.
  -  The join is a one (parent) to many (children), inner join.
As usual, patches are welcome!
There is work underway to create a more flexible, but likely less
performant, query-time join capability, which should address a number
of the above limitations.