At the start of the course, we learned how to manipulate strings, and how to read and write files. In this lecture, we go over a few useful features of Python that make it easier to deal with lists of files, as well as formatting data into strings (which can be useful for e.g. constructing filenames or writing data).
glob module
In the Linux command line, it is possible to list multiple files matching a pattern with e.g.:
$ ls *.py
This means: list all files ending in .py.
The built-in glob module allows you to do something similar from Python. The most important function in the glob module is also called glob. This function can be given a pattern (such as *.py) and will return a list of filenames that match:
import glob
glob.glob('*.ipynb')
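As a minimal sketch of how this can be used in practice, the helper below (list_matching is a name made up for this example, not part of the glob module) wraps the call and sorts the result, since glob.glob returns matches in arbitrary order. The filename pattern data_[0-9].txt is also just an illustration.

```python
import glob

def list_matching(pattern):
    """Return the filenames matching pattern, sorted alphabetically."""
    # glob.glob returns matches in arbitrary order, so sort for stable output
    return sorted(glob.glob(pattern))

# All Python files in the current directory
print(list_matching('*.py'))

# [0-9] matches exactly one digit, so data_1.txt matches but data_12.txt does not
print(list_matching('data_[0-9].txt'))
```

If nothing matches the pattern, the result is simply an empty list rather than an error.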
os module
The os module allows you to interact with the system, and also contains utilities to construct or analyse file paths. The os.path sub-module is particularly useful for accessing files - for example,
import os
os.path.exists('test.py')
can be used to find out if a file exists.
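A common use of this check is to decide what to do before opening a path. The sketch below is only an illustration: describe is a hypothetical helper, and it also uses os.path.isdir to tell directories apart from regular files.

```python
import os

def describe(path):
    """Return 'missing', 'directory' or 'file' for the given path."""
    if not os.path.exists(path):
        return 'missing'
    if os.path.isdir(path):
        return 'directory'
    return 'file'

# test.py may or may not exist on your system
print(describe('test.py'))

# '.' is the current directory
print(describe('.'))
```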
When constructing the path to a file, for example data/file.txt, one normally has to worry about whether this is a Linux/Mac or a Windows file path (since Linux/Mac use / and Windows uses \). However, the os module allows you to construct file paths without worrying about this:
os.path.join('data', 'file.txt')
This can be combined with glob, for example:
glob.glob(os.path.join('data', '*.txt'))
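To make the combination concrete, here is a small sketch. The function name text_files_in is hypothetical, and the data directory is assumed to exist only for the sake of the example; if it does not, the loop simply prints nothing.

```python
import glob
import os

def text_files_in(directory):
    """Return the sorted paths of the .txt files directly inside directory."""
    pattern = os.path.join(directory, '*.txt')
    return sorted(glob.glob(pattern))

# os.sep is the separator that os.path.join inserts on this system
print(os.path.join('data', 'file.txt'), 'uses separator', repr(os.sep))

# The returned paths include the directory, so they can be opened directly
for path in text_files_in('data'):
    print(path)
```

Because the pattern is built with os.path.join, the same code should work unchanged on Linux, Mac and Windows.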
The os module also has other useful functions, which you can find out about in the documentation.
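As a taste of what is available, the sketch below takes a path apart and builds a new filename from the pieces, using os.path.dirname, os.path.basename and os.path.splitext. The _copy suffix is just an example chosen here.

```python
import os

path = os.path.join('data', 'file.txt')

directory = os.path.dirname(path)              # 'data'
filename = os.path.basename(path)              # 'file.txt'
stem, extension = os.path.splitext(filename)   # 'file' and '.txt'

# Build a new filename in the same directory from the pieces
new_path = os.path.join(directory, stem + '_copy' + extension)
print(new_path)

# os.getcwd returns the current working directory
print(os.getcwd())
```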
The os.path.getsize function can be used to find the size of a file in bytes. Loop over all the files in the current directory using glob and, for each one, print out the filename and the size in kilobytes (1 kilobyte = 1024 bytes):
# your solution here