Sometimes you want to collect “paragraphs”: contiguous runs of text lines
that are delineated by blank lines. Markdown and RST document formats,
for example, use this convention.
textdata has a paras routine to extract such paragraphs:
>>> rhyme = """
... Hey diddle diddle,
...
... The cat and the fiddle,
... The cow jumped over the moon.
... The little dog laughed,
... To see such sport,
...
... And the dish ran away with the spoon.
... """
>>> paras(rhyme)
[['Hey diddle diddle,'],
 ['The cat and the fiddle,',
  'The cow jumped over the moon.',
  'The little dog laughed,',
  'To see such sport,'],
 ['And the dish ran away with the spoon.']]
Or, if you’d like each paragraph returned as a single string rather than a list of lines:
>>> paras(rhyme, join="\n")
['Hey diddle diddle,',
 'The cat and the fiddle,\nThe cow jumped over the moon.\nThe little dog laughed,\nTo see such sport,',
 'And the dish ran away with the spoon.']
Setting join to a space will of course concatenate the lines of each
paragraph with a space. This is useful for converting line-oriented
paragraphs into a form where each paragraph is a single (potentially very
long) line, a format that is convenient for cutting and pasting into many
editors, text entry boxes on the Web, and email systems.
On the off chance you want to preserve the exact intra-paragraph spacing,
keep_blanks=True will accomplish that.
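
To make the grouping rule concrete, here is a minimal pure-Python sketch of blank-line-delimited paragraph splitting with an optional join string. It is not textdata's implementation: `split_paragraphs` is a hypothetical helper written only for illustration, and it assumes every line should be stripped of surrounding whitespace and that blank lines are always discarded.

```python
from itertools import groupby


def split_paragraphs(text, join=None):
    """Group contiguous non-blank lines into paragraphs.

    Hypothetical helper for illustration only; not part of textdata.
    With join=None each paragraph is a list of lines; otherwise each
    paragraph is a single string joined with the given separator.
    """
    # Strip each line so whitespace-only lines count as blank.
    lines = [line.strip() for line in text.splitlines()]
    # groupby clusters consecutive lines by truthiness (non-blank vs. blank);
    # keep only the non-blank runs.
    groups = [list(run) for nonblank, run in groupby(lines, key=bool) if nonblank]
    if join is None:
        return groups
    return [join.join(group) for group in groups]


rhyme = """
Hey diddle diddle,

The cat and the fiddle,
The cow jumped over the moon.
The little dog laughed,
To see such sport,

And the dish ran away with the spoon.
"""

as_lists = split_paragraphs(rhyme)
as_long_lines = split_paragraphs(rhyme, join=" ")
```

Here `as_lists` holds three paragraphs as lists of lines, while `as_long_lines` holds the same three paragraphs as single space-joined strings, the form suited to pasting into text entry boxes.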