These files are all examples of “Comma Separated Value (CSV) files”.
These files are typically created by exporting from spreadsheets or databases, often for import into other
spreadsheets or databases.
CSV files are files where each line has the same structure, consisting of a number of values (called “fields”)
separated by some specified character (or sequence of characters), typically a comma (hence the name
“comma separated values”). The character (or characters) that separates the fields is called a “delimiter”.
There may be spaces after (or before) the delimiter to separate the fields, or the fields may be “padded out”
with spaces to make them all have the same number of characters. (The number of characters in a field is
called the field’s “width”.)
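To make these terms concrete, here is a minimal sketch that takes one line apart by hand. The sample values and the padding width of 8 are invented for illustration and do not come from any of the course files.

```python
# Hypothetical sample: three values, each padded with spaces to a width of 8,
# then joined with a comma as the delimiter.
values = ["Smith", "John", "42"]
delimiter = ","
line = delimiter.join(v.ljust(8) for v in values)

raw_fields = line.split(delimiter)         # split the line at each delimiter
widths = [len(f) for f in raw_fields]      # width of each field, padding included
fields = [f.strip() for f in raw_fields]   # remove the padding spaces

print(repr(line))
print(widths)
print(fields)
```

This works for simple lines like this one, but it would break as soon as a value itself contained a comma.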
Obviously, we could write our own functions for making sense of (parsing) every arrangement of data we
come across, but there is an easier way for this large class of text files. We use one of the standard Python
modules: the csv module.
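As a first taste, the following sketch reads some comma separated data with csv.reader. The sample text is made up and merely stands in for a file such as data1.csv, whose actual contents are not reproduced here. With a real file you would pass the result of open("data1.csv", newline="") to csv.reader instead of the io.StringIO object.

```python
import csv
import io

# Hypothetical data standing in for the contents of a CSV file.
sample = "name,age,city\nAlice,30,Cambridge\nBob,25,Oxford\n"

# csv.reader takes any object that yields lines of text, such as an open file.
rows = list(csv.reader(io.StringIO(sample)))

for row in rows:
    print(row)   # each row is a list of strings, one string per field
```

Note that every field comes back as a string, so numbers such as the ages here need converting (for example with int()) before you do arithmetic on them.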
Compare data1.csv and data2.csv. The file data2.csv doesn’t use commas at all: its delimiter is the TAB
character. Also, in CSV files, data with spaces in it is often “quoted” (surrounded by quotation marks) to make
clear that it is one single item of data and should be treated as such. In Python this is not necessary unless
you are using a space as your delimiter, but you will often find that programs that produce CSV files
automatically quote data with spaces in it.
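The sketch below shows how a TAB-delimited file with some quoted fields might be read. The sample text is invented (it is not the content of data2.csv); the delimiter parameter of csv.reader is what tells the module to split on TABs rather than commas.

```python
import csv
import io

# Hypothetical TAB-delimited data; some fields containing spaces are quoted,
# others are not.
sample = 'name\ttown\n"Ada Lovelace"\tLondon\nAlan Turing\t"Maida Vale"\n'

rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

for row in rows:
    print(row)   # the surrounding quotation marks are removed on reading
```

Quoted and unquoted fields with spaces are read the same way here, because the space is not the delimiter.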
Look at data4.csv. If you want, you can quote all the data in the file, or all the text data. Python doesn’t
mind if you quote data even when it is not strictly necessary.
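If you are producing CSV data yourself, the csv module can do the quoting for you. The following sketch, using made-up rows, writes the same data twice: once quoting every field (csv.QUOTE_ALL) and once quoting only the non-numeric fields (csv.QUOTE_NONNUMERIC). It then reads the fully quoted version back in.

```python
import csv
import io

# Hypothetical rows: one text value and one number per row.
rows = [["Alice", 30], ["Bob", 25]]

# Quote every field, whether it needs it or not.
buf_all = io.StringIO(newline="")
csv.writer(buf_all, quoting=csv.QUOTE_ALL).writerows(rows)

# Quote only the fields that are not numbers.
buf_text = io.StringIO(newline="")
csv.writer(buf_text, quoting=csv.QUOTE_NONNUMERIC).writerows(rows)

print(buf_all.getvalue())
print(buf_text.getvalue())

# Reading the fully quoted data back: the quotation marks are simply dropped.
read_back = list(csv.reader(io.StringIO(buf_all.getvalue(), newline="")))
print(read_back)
```

With the default settings, the reader returns the numbers as strings again, quoted or not.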
Look at von Hayek’s entry in weird4.csv. If your data contains special characters, such as the delimiter or
a new line (‘\n’) character, then you will need to quote that data or Python will get confused when it reads
the CSV file.
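The following sketch illustrates the point with invented data (it is not the actual content of weird4.csv). One field contains the delimiter and another contains a new line character; because both are quoted, each is read as a single item. The last two lines show what happens to the same name when it is not quoted.

```python
import csv
import io

# Hypothetical data: the first field of the second record contains a comma,
# and its second field contains a new line character. Both are quoted.
sample = (
    'name,note\n'
    '"von Hayek, Friedrich","first line\nsecond line"\n'
    'Keynes,plain text\n'
)

# newline="" leaves line endings untouched so that the csv module can
# handle new lines inside quoted fields itself (use it with open() too).
rows = list(csv.reader(io.StringIO(sample, newline="")))
for row in rows:
    print(row)

# Without quotation marks the comma in the name is taken as a delimiter.
unquoted = next(csv.reader(["von Hayek, Friedrich,some note"]))
print(unquoted)
```

The quoted record comes back as two fields, whereas the unquoted line is split into three.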