Creating a chunked phrase corpus