CheatSheet - Python File Operations
This short tutorial isn’t intended to cover everything in file reading and writing in Python. Instead, I just wanted to share with you some cheat sheets that I created for reading and writing files in my daily Python programming.
The code presented here is written in Python 3.7.3, and doesn’t always apply to earlier versions of Python.
We’ll focus on built-in functions and methods from the standard library, so that we don’t need to install third-party packages, which is not always straightforward for beginners.
Open a File
The first step in dealing with a file in Python is to open it. To do that, call the built-in open() function, which returns a file object. The function accepts several parameters, but the two most commonly used are the filename and the mode.
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
- Open file and return a corresponding file object. If the file cannot be opened, an OSError is raised. See Reading and Writing Files for more examples of how to use this function.
- file is a path-like object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped. (If a file descriptor is given, it is closed when the returned I/O object is closed unless closefd is set to False.)
- mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r', which means open for reading in text mode. Other common values are 'w' for writing (truncating the file if it already exists), 'x' for exclusive creation, and 'a' for appending (which on some Unix systems means that all writes append to the end of the file regardless of the current seek position).
- In text mode, if encoding is not specified the encoding used is platform-dependent: locale.getpreferredencoding(False) is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.)
Normally, files are opened in text mode, that means, you read and write strings from and to the file, which are encoded in a specific encoding. If encoding is not specified, the default is platform dependent (see open()).
'b' appended to the mode opens the file in binary mode: now the data is read and written in the form of bytes objects. This mode should be used for all files that don’t contain text.
In text mode, the default when reading is to convert platform-specific line endings (\n on Unix, \r\n on Windows) to just \n. When writing in text mode, the default is to convert occurrences of \n back to platform-specific line endings. This behind-the-scenes modification to file data is fine for text files, but will corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files.
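The difference between the two modes can be seen in a small sketch. The scratch file name and its contents are made up for illustration; newline="" is passed on write only so that the \r\n endings are stored unchanged on every platform.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# newline="" disables newline translation on write, so "\r\n" is stored as-is.
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write("first\r\nsecond\r\n")

# Text mode: the result is a str and "\r\n" is translated to "\n".
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Binary mode: the result is a bytes object holding exactly what is on disk.
with open(path, "rb") as f:
    raw = f.read()

print(type(text).__name__, repr(text))
print(type(raw).__name__, repr(raw))
```

Text mode hides the stored \r\n pairs, while binary mode returns them untouched, which is why binary mode is the safe choice for non-text data.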
It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point. Using with is also much shorter than writing equivalent try-finally blocks:

    with open('workfile', encoding='utf-8') as f:
        read_data = f.read()

    # We can check that the file has been automatically closed.
    f.closed
    # output: True
Warning: calling f.write() without using the with keyword or calling f.close() might result in the arguments of f.write() not being completely written to the disk, even if the program exits successfully.
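A minimal sketch of the "even if an exception is raised" guarantee follows. The scratch file and the simulated ValueError are assumptions made up for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "workfile")

try:
    with open(path, "w", encoding="utf-8") as f:
        f.write("partial data")
        raise ValueError("simulated failure")  # leave the block abnormally
except ValueError:
    pass

# The with statement closed the file despite the exception...
closed_after_error = f.closed

# ...and closing flushed the buffered text to disk.
with open(path, encoding="utf-8") as f2:
    saved = f2.read()

print(closed_after_error, repr(saved))
```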
Modes for opening files
| Mode | Description |
|---|---|
| 'r' | Opens a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode. |
| 'w' | Opens a file for writing only, truncating the file first. Overwrites the file if it exists. If the file does not exist, creates a new file for writing. |
| 'x' | Opens for exclusive creation, failing if the file already exists. |
| 'a' | Opens a file for appending. The file pointer is at the end of the file if the file exists. If the file does not exist, creates a new file for writing. |
| '+' | Opens for updating (reading and writing). Combined with one of the modes above. |
| 'r+' | Opens a file for both reading and writing. The file pointer is placed at the beginning of the file. |
| 'a+' | Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. If the file does not exist, creates a new file for reading and writing. |
| 'w+' | Opens a file for both writing and reading. Overwrites the existing file if it exists. If the file does not exist, creates a new file for reading and writing. |
| 't' | Text mode (default). |
| 'rb' | Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file. |
| 'rb+' | Opens a file for both reading and writing in binary format. The file pointer is placed at the beginning of the file. |
| 'wb' | Opens a file for writing only in binary format. Overwrites the file if it exists. If the file does not exist, creates a new file for writing. |
| 'wb+' | Opens a file for both writing and reading in binary format. Overwrites the existing file if it exists. If the file does not exist, creates a new file for reading and writing. |
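Two of the less familiar modes in the table, x and r+, can be illustrated with a short sketch. The scratch file and its contents are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# 'x' creates the file, because it does not exist yet.
with open(path, "x", encoding="utf-8") as f:
    f.write("hello world")

# A second 'x' open fails, because the file now exists.
try:
    with open(path, "x", encoding="utf-8") as f:
        pass
    x_failed = False
except FileExistsError:
    x_failed = True

# 'r+' starts at the beginning and overwrites in place without truncating.
with open(path, "r+", encoding="utf-8") as f:
    f.write("HELLO")

with open(path, encoding="utf-8") as f:
    after_update = f.read()

print(x_failed, repr(after_update))
```

Only the first five characters are replaced by the r+ write; the rest of the original text is kept, unlike with the w mode.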
Functions for reading and writing files
| Function / keyword | Arguments | Description |
|---|---|---|
| open() | filename and open mode (optional) | Create a file object by opening/creating the file in the specified read/write mode. |
| with | - | Use it together with open(); closes the file after the suite completes. |
| read() | size (optional) | Read the file up to the size specified, if set. |
| readline() | size (optional) | Read a single line, with a size limit if set. |
| readlines() | size (optional) | Create a list of the read lines, with an optional size hint. |
| for loop | - | Iterate over the lines of the file. |
| write() | the string | Write the string to a file opened in a writable mode. |
| writelines() | the list of strings | Write the list of strings to a file opened in a writable mode. |
| close() | - | Close the opened file. |
| seek() | offset, whence=SEEK_SET | Change the stream position to the given byte offset. |
| truncate() | size=None | Resize the stream to the given size in bytes (or the current position if size is not specified). |
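truncate() is the one entry in the table that the rest of this tutorial does not demonstrate, so here is a minimal sketch. It uses binary mode so that sizes are plain byte counts; the scratch file and its contents are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "truncate.bin")

with open(path, "wb+") as f:
    f.write(b"0123456789")

    # Explicit size: keep only the first 4 bytes.
    new_size = f.truncate(4)
    f.seek(0)
    kept = f.read()

    # No size: cut the file at the current position (byte 2 here).
    f.seek(2)
    f.truncate()
    f.seek(0)
    kept_again = f.read()

final_size = os.path.getsize(path)
print(new_size, kept, kept_again, final_size)
```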
Suppose that we have the following text file named test.txt for the purpose of this tutorial. You can create one like the below in your Python working directory if you want to try these methods while following this tutorial.
    0 Start Line 0 - Line 0 End 0
    1 Start Line 1 - Line 1 End 1

To open this file, we can simply call the open() function as shown below, specifying the filename and the open mode. As the interpreter output shows, a file object, here a text stream (i.e., TextIOWrapper), is created.

    >>> open('test.txt', 'r')
    <_io.TextIOWrapper name='test.txt' mode='r' encoding='UTF-8'>
The modes table above summarizes the common open modes with their intended uses. If the open mode is omitted, the default mode is r.
To open the file in the binary mode, you need to append the letter b after the mode listed above (e.g., rb, rb+). The binary mode should be used when the file doesn’t contain text.
We can use the with keyword when a file is opened.
Using the with keyword allows the file to be properly closed automatically after the suite completes, even when an exception is raised during the process.
    >>> with open('test.txt') as file:
    ...     file.read()
    ...
    '0 Start Line 0 - Line 0 End 0\n1 Start Line 1 - Line 1 End 1\n'

In the above code snippet, a few things are worth noting.
- When the mode argument is omitted, the file is read in text mode and the line endings are converted to \n.
- Because of the with keyword, the file object is closed as expected.
Read a File
As you’re probably aware, we used the read() method just now. In this method, you can specify the size that you want to read from the file object.
When the size argument is omitted or set as negative, or set as an integer greater than the file size, all the file contents will be read.
We can also specify a size smaller than the file size. In the example below, by specifying the size to be 6, only the first 6 characters will be read for the text file.
    >>> with open('test.txt') as file:
    ...     file.read(6)
    ...
    '0 Star'
If the end of the file has been reached after reading, an empty string ('') will be returned.
For the example given below, an empty string is returned by the second read() method, as the first call has completed the reading of the entire file.
    >>> with open('test.txt') as file:
    ...     file.read()
    ...     file.read()
    ...
    '0 Start Line 0 - Line 0 End 0\n1 Start Line 1 - Line 1 End 1\n'
    ''
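The empty-string return value is what makes a chunked reading loop possible. The sketch below recreates the sample file in a temporary directory; the chunk size of 16 is an arbitrary choice for illustration.

```python
import os
import tempfile

# Recreate the tutorial's sample file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
content = "0 Start Line 0 - Line 0 End 0\n1 Start Line 1 - Line 1 End 1\n"
with open(path, "w", encoding="utf-8") as f:
    f.write(content)

chunks = []
with open(path, encoding="utf-8") as f:
    while True:
        chunk = f.read(16)   # at most 16 characters per call
        if chunk == "":      # empty string: end of file reached
            break
        chunks.append(chunk)

print(chunks)
```

Reading in fixed-size chunks keeps memory use bounded, which matters when a file is too large to load with a single read() call.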
We can also read individual lines by using the readline() method. This method will read the text for the entire line including the line ending (i.e., \n).
However, the \n will be omitted for the last line of the file, if the file doesn’t end in a newline. When reading has reached the end of the file, an empty string will be returned.
If you call the readline() method multiple times, the reading will be continued at where it was read last time.
    >>> with open('test.txt') as file:
    ...     file.readline()
    ...     file.readline()
    ...     file.readline()
    ...
    '0 Start Line 0 - Line 0 End 0\n'
    '1 Start Line 1 - Line 1 End 1\n'
    ''
Actually, the readline() method can also take in a size parameter, which will be used as a limit for reading the line.
In the code snippet below, readline(2) and readline(5) will read just 2 and 5 characters, respectively.
Again, reading is continued at where it was read last time.
    >>> with open('test.txt') as file:
    ...     file.readline()
    ...     file.readline(2)
    ...     file.readline(5)
    ...     file.readline()
    ...     file.readline()
    ...
    '0 Start Line 0 - Line 0 End 0\n'
    '1 '
    'Start'
    ' Line 1 - Line 1 End 1\n'
    ''
One more thing to note is that a blank line is returned as '\n' — a string containing only a single new line.
If you want to read all the lines with a single method, you can use readlines(), which will create a list of individual lines.
    >>> with open('test.txt') as file:
    ...     file.readlines()
    ...
    ['0 Start Line 0 - Line 0 End 0\n', '1 Start Line 1 - Line 1 End 1\n']
The readlines() method can also optionally set a size hint.
When it’s set as a positive integer, it will read that many characters (or bytes in the binary mode) from the file and enough to complete that line.
In our case, the first line is 29 characters long, or 30 including its trailing \n, and reading stops only once the total read so far exceeds the hint. Thus, a hint of 1 to 29 reads just the first line, while a hint of 30 or greater reads both lines, as shown below.

    >>> with open('test.txt') as file:
    ...     file.readlines()
    ...
    ['0 Start Line 0 - Line 0 End 0\n', '1 Start Line 1 - Line 1 End 1\n']
    >>> with open('test.txt') as file:
    ...     file.readlines(29)
    ...
    ['0 Start Line 0 - Line 0 End 0\n']
    >>> with open('test.txt') as file:
    ...     file.readlines(30)
    ...
    ['0 Start Line 0 - Line 0 End 0\n', '1 Start Line 1 - Line 1 End 1\n']
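The cheat sheet also lists the for loop as a way to read a file. A minimal sketch, again recreating the sample file in a temporary directory: iterating over the file object yields one line at a time, so the whole file never has to be held in memory as it is with readlines().

```python
import os
import tempfile

# Recreate the tutorial's sample file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("0 Start Line 0 - Line 0 End 0\n1 Start Line 1 - Line 1 End 1\n")

numbered = []
with open(path, encoding="utf-8") as f:
    # The file object is its own iterator and yields lines including "\n".
    for number, line in enumerate(f):
        numbered.append((number, line.rstrip("\n")))

print(numbered)
```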
Write a File
To write new values (e.g., text) to the file, we can use the write() method.
However, how the new values are written to the file depends on the open mode, as summarized in the modes table above.
For example, the w mode will truncate the file, and thus the file will only contain the new values. The a mode will allow you to append new values to the existing file.
    >>> with open('test.txt', 'w') as file:
    ...     file.write('This is a writing method.')
    ...
    25

    ~] cat test.txt
    This is a writing method.

The above code truncates the file when it is opened and then writes the specified string to it.
A side note: the write() method returns the number of characters written, and the interactive interpreter echoes that value. The echo can be suppressed by assigning the returned value to an underscore:
_ = file.write('str to be written')
After the writing action, you can see the current text of the file. As expected, only the new string is contained in the file.

    >>> with open('test.txt') as file:
    ...     file.read()
    ...
    'This is a writing method.'
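To contrast the w and a modes side by side, here is a minimal sketch that works on a scratch copy of the file; the appended and replacement strings are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# 'w' creates the file (or truncates an existing one) before writing.
with open(path, "w", encoding="utf-8") as f:
    count = f.write("This is a writing method.")  # number of characters written

# 'a' keeps the existing text and writes at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write(" Appended.")
with open(path, encoding="utf-8") as f:
    after_append = f.read()

# Opening with 'w' again discards everything written so far.
with open(path, "w", encoding="utf-8") as f:
    f.write("Replaced.")
with open(path, encoding="utf-8") as f:
    after_rewrite = f.read()

print(count, repr(after_append), repr(after_rewrite))
```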
We can also use the writelines() method to write multiple lines to the file.
This method takes a list of strings as its parameter and writes them as they are, without adding line separators. Whether the lines overwrite or append to the existing text follows the same open-mode rules as the write() method.

    >>> with open('test.txt', 'w') as file:
    ...     file.writelines(['Line 0\n', 'Line 1'])
    ...
    >>> with open('test.txt') as file:
    ...     file.read()
    ...
    'Line 0\nLine 1'

    ~] cat test.txt
    Line 0
    Line 1
After executing the above code, the file’s text becomes 'Line 0\nLine 1' as read by the default read() method.
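Note that writelines() does not insert line endings, which is why 'Line 0\n' carries its own \n in the example above. The sketch below shows the difference; the list of items and the scratch file are made up for illustration.

```python
import os
import tempfile

# Hypothetical scratch file and data used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
items = ["alpha", "beta", "gamma"]

# Without explicit newlines, the strings run together.
with open(path, "w", encoding="utf-8") as f:
    f.writelines(items)
with open(path, encoding="utf-8") as f:
    joined = f.read()

# Adding "\n" to each string gives one item per line.
with open(path, "w", encoding="utf-8") as f:
    f.writelines(item + "\n" for item in items)
with open(path, encoding="utf-8") as f:
    separated = f.readlines()

print(repr(joined), separated)
```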
Close a File
After you’re done working with the file, it’s always good practice to close the file by calling the close() method on the file object.
As shown below, the closed attribute is False before the file is closed, and becomes True after we call the close() method.

    >>> file = open('test.txt')
    >>> file.closed
    False
    >>> file.close()
    >>> file.closed
    True
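When close() is called manually, a try-finally block is the usual way to make sure it runs even if an error occurs in between. This is a minimal sketch of that pattern with a made-up scratch file; it also shows that using a file after closing it raises a ValueError.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

file = open(path, "w", encoding="utf-8")
try:
    file.write("some text")
finally:
    file.close()  # runs whether or not write() raised an exception

is_closed = file.closed

# Any further I/O on the closed file object is rejected.
try:
    file.write("more text")
    write_rejected = False
except ValueError:
    write_rejected = True

print(is_closed, write_rejected)
```

The with statement performs the same cleanup with less code, which is why it is generally preferred.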
Change the file object’s position
The seek(offset, whence=SEEK_SET) method changes the stream position to the given offset. Its behaviour depends on the whence parameter, which defaults to SEEK_SET. It returns the new absolute position as an opaque number.
- SEEK_SET or 0: seek from the start of the stream (the default); offset must either be a number returned by TextIOBase.tell(), or zero. Any other offset value produces undefined behaviour.
- SEEK_CUR or 1: "seek" to the current position; offset must be zero, which is a no-operation (all other values are unsupported).
- SEEK_END or 2: seek to the end of the stream; offset must be zero (all other values are unsupported).
In other words, whence means:
- 0: means your reference point is the beginning of the file
- 1: means your reference point is the current file position
- 2: means your reference point is the end of the file
In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very end of the file with seek(0, 2)), and the only valid offset values are those returned from f.tell(), or zero. Any other offset value produces undefined behaviour.
    f = open('workfile', 'wb+')
    f.write(b'0123456789abcdef')
    # output: 16
    f.seek(5)      # Go to the 6th byte in the file
    # output: 5
    f.read(1)
    # output: b'5'
    f.seek(-3, 2)  # Go to the 3rd byte before the end
    # output: 13
    f.read(1)
    # output: b'd'
    f.seek(0, 0)   # Go to the beginning of file
    # output: 0
    f.read()
    # output: b'0123456789abcdef'
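For text files, the restrictions above mean that tell() and seek() are best used as a bookmark pair. The following is a minimal sketch under that assumption, using a scratch copy of the sample file: the value returned by tell() is treated as opaque and only passed back to seek().

```python
import os
import tempfile

# Recreate the tutorial's sample file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("0 Start Line 0 - Line 0 End 0\n1 Start Line 1 - Line 1 End 1\n")

with open(path, encoding="utf-8") as f:
    first = f.readline()
    bookmark = f.tell()          # opaque position just after the first line
    second = f.readline()

    f.seek(bookmark)             # valid: the offset came from tell()
    second_again = f.readline()

    f.seek(0)                    # valid: zero offset from the start
    first_again = f.readline()

    f.seek(0, 2)                 # valid: jump to the very end of the file
    at_end = f.read()

print(repr(second_again), repr(first_again), repr(at_end))
```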