Words¶
Often the data you need to encode is almost, but not quite, a series of words. A list of names, a list of color names–values that are mostly single words, but sometimes have an embedded spaces.
>>> words(' Billy Bobby "Mr. Smith" "Mrs. Jones" ')
['Billy', 'Bobby', 'Mr. Smith', 'Mrs. Jones']
Embedded quotes (either single or double) can be used to construct “words” (really, phrases) containing whitespace (including tabs and newlines).
words
isn’t a full parser, so there are some extreme cases like
arbitrarily nested quotations that it can’t handle. It isn’t confused,
however, by embedded apostrophes and other common gotchas.
>>> words("don't be blue")
["don't", "be", "blue"]
>>> words(""" "'this'" works '"great"' """)
["'this'", 'works', '"great"']
words
is a good choice for situations where you want a compact,
friendly, whitespace-delimited data representation–but a few of your
entries need more than just str.split()
.
Explicit Separators¶
There is a second mode of operation for words
in which
you provide explicit separators. This is handy if, for example,
you have a number of phrases with embedded spaces. This happens often
when importing data from spreadsheets.
>>> words('First Name / Last Name / Age / Best Feature', sep='/')
['First Name', Last Name', 'Age', 'Best Feature']
Here you have a very terse specification of the phrases, without the need to quote in order to preserve embedded spaces.