Python Files and os.path
Directories
- The module called os contains functions to get information on local directories, files, processes, and environment variables.
os.getcwd()
- The current working directory is a property that Python holds in memory at all times.
- There is always a current working directory, whether we're in the Python Shell, running our own Python script from the command line, etc.
- The os.getcwd() function returns the current working directory.
- When we run the graphical Python Shell, the current working directory starts as the directory where the Python Shell executable is. On Windows, this depends on where we installed Python; the default directory is c:\Python32.
- If we run the Python Shell from the command line, the current working directory starts as the directory we were in when we ran python3.
- The os.chdir() function changes the current working directory.
- Note that os.chdir() accepts a Linux-style pathname (forward slashes, no drive letter) even when we're on Windows.
- This is one of the places where Python tries to paper over the differences between operating systems.
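A minimal sketch of the two calls described above. It assumes a throwaway directory created with tempfile.mkdtemp() so that the target of os.chdir() is guaranteed to exist; the variable names are illustrative and not taken from the original session.

```python
import os
import tempfile

# Remember where we started so we can go back afterwards.
original_dir = os.getcwd()

# Hypothetical scratch directory; mkdtemp() creates it for us.
scratch_dir = tempfile.mkdtemp()

# Change the current working directory and confirm the change.
os.chdir(scratch_dir)
# realpath() resolves symbolic links so the comparison is reliable
# on systems where the temp directory sits behind a symlink.
changed_ok = os.path.realpath(os.getcwd()) == os.path.realpath(scratch_dir)
print(os.getcwd())

# Restore the original working directory and clean up.
os.chdir(original_dir)
os.rmdir(scratch_dir)
```

Saving and restoring the original directory keeps the rest of a script unaffected by the change.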
os.path.join()
- os.path contains functions for manipulating filenames and directory names.
- The os.path.join() function constructs a pathname out of one or more partial pathnames.
- In the simplest case, it just concatenates strings, inserting a directory separator between the parts only if the first part does not already end with one.
- The os.path.expanduser() function will expand a pathname that uses ~ to represent the current user's home directory.
- This works on any platform where users have a home directory, including Linux, Mac OS X, and Windows.
- The returned path does not have a trailing slash, but the os.path.join() function doesn't mind.
- Combining these techniques, we can easily construct pathnames for directories and files in the user's home directory. The os.path.join() function can take any number of arguments.
- Note: we need to be careful about the strings we pass to os.path.join().
- If a component starts with "/", Python treats it as an absolute path and discards every component before it.
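The points above can be sketched as follows. The directory and file names ("projects", "notes", "todo.txt", "data", "report.txt") are hypothetical and only serve as illustration.

```python
import os

# Expand ~ to the current user's home directory.
home = os.path.expanduser("~")

# join() accepts any number of components (names here are hypothetical).
config_path = os.path.join(home, "projects", "notes", "todo.txt")

# A separator is inserted between the parts...
joined = os.path.join("data", "report.txt")
# ...but not duplicated if the first part already ends with one.
same = os.path.join("data" + os.sep, "report.txt")

# A component that starts with "/" is treated as absolute,
# so everything before it ("data") is discarded.
overridden = os.path.join("data", "/etc", "hosts")

print(config_path)
print(joined)
print(overridden)
```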
os.path.split()
- os.path also contains functions to split full pathnames, directory names, and filenames into their constituent parts.
- The split() function splits a full pathname and returns a tuple containing the path and filename.
- The os.path.split() function returns both values in a single tuple. We can assign the return value to a tuple of two variables, and each variable receives the value of the corresponding element of the returned tuple.
- The first variable, dirname, receives the value of the first element of the tuple returned from the os.path.split() function, the file path.
- The second variable, filename, receives the value of the second element of the tuple returned from the os.path.split() function, the filename.
- os.path also contains the os.path.splitext() function, which splits a filename and returns a tuple containing the filename and the file extension.
- We used the same technique to assign each of them to separate variables.
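A short sketch of both splitting functions, using a hypothetical pathname built with os.path.join() so that it works with the separator of the current platform:

```python
import os

# Hypothetical pathname; the file does not need to exist.
pathname = os.path.join("home", "user", "docs", "report.final.txt")

# split() returns (directory part, final component).
dirname, filename = os.path.split(pathname)

# splitext() splits at the last dot, so only ".txt" is the extension.
shortname, extension = os.path.splitext(filename)

print(dirname)
print(filename)
print(shortname)
print(extension)
```

Note that splitext() keeps the leading dot in the extension and only splits at the last dot in the name.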
glob.glob()
- The glob module is another tool in the Python standard library.
- It's an easy way to get the contents of a directory programmatically, and it uses the sort of wildcards that we may already be familiar with from working on the command line.
- The glob.glob() function takes a wildcard pattern and returns a list of the paths of all files and directories matching it.
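As an illustration, the sketch below creates a few empty files in a temporary directory and then lists only the ones ending in .py. The file names are hypothetical. The results are sorted because glob.glob() does not promise any particular order.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    # Create three empty files with hypothetical names.
    for name in ("alpha.py", "beta.py", "notes.txt"):
        with open(os.path.join(scratch_dir, name), "w") as f:
            f.write("")

    # glob.escape() protects any special characters in the directory name;
    # only the "*.py" part is treated as a wildcard.
    pattern = os.path.join(glob.escape(scratch_dir), "*.py")
    matches = sorted(glob.glob(pattern))

    # Keep just the file names for display.
    py_names = [os.path.basename(p) for p in matches]

print(py_names)
```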
File metadata
- Every file system stores metadata about each file: creation date, last-modified date, file size, and so on.
- Python provides a single API to access this metadata. We don't need to open the file; all we need is the filename.
- Calling the os.stat() function returns an object that contains several different types of metadata about the file. st_mtime is the modification time, but it's in a format that isn't terribly useful.
- It is the number of seconds since the Epoch, which is defined as 00:00:00 UTC on January 1st, 1970.
- The time module is part of the Python standard library. It contains functions to convert between different time representations, format time values into strings, and fiddle with timezones.
- The time.localtime() function converts a time value from seconds-since-the-Epoch (from the st_mtime property returned from the os.stat() function) into a more useful structure of year, month, day, hour, minute, second, and so on.
- This file was last modified on Feb 2, 2013, at around 9:12 PM.
- The os.stat() function also returns the size of a file, in the st_size property. The file "test1.py" is 1844 bytes.
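A minimal sketch of reading metadata, assuming a small temporary file written just for the demonstration (it is not the "test1.py" file mentioned above, so the printed values will differ):

```python
import os
import tempfile
import time

# Write a tiny sample file so there is something to inspect.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello")
    sample_path = f.name

# os.stat() only needs the filename; the file is not opened here.
metadata = os.stat(sample_path)

# Size in bytes ("hello" is 5 ASCII characters).
size_in_bytes = metadata.st_size

# st_mtime is seconds since the Epoch; localtime() turns it into a
# struct_time with year, month, day, hour, minute, second fields.
modified = time.localtime(metadata.st_mtime)
formatted = time.strftime("%Y-%m-%d %H:%M:%S", modified)

print(size_in_bytes, formatted)

# Clean up the sample file.
os.remove(sample_path)
```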
os.path.realpath() - Absolute pathname
- The glob.glob() function returned a list of relative pathnames. If we want to construct an absolute pathname - i.e. one that includes all the directory names back to the root directory or drive letter - then we'll need the os.path.realpath() function.
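A brief sketch, assuming a hypothetical relative pathname; the file does not have to exist for os.path.realpath() to build an absolute path from the current working directory:

```python
import os

# Hypothetical relative pathname (no leading slash or drive letter).
relative = os.path.join("subdir", "example.txt")

# realpath() anchors it at the current working directory and also
# resolves any symbolic links along the way.
absolute = os.path.realpath(relative)

print(absolute)
```

If symbolic links should be left as they are, os.path.abspath() produces an absolute pathname without resolving them.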
os.path.expandvars() - Env. variable
- The os.path.expandvars() function expands references to environment variables (such as $HOME) in a pathname, replacing each one with its value.
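The behaviour can be sketched as follows. DEMO_DATA_DIR and DEMO_UNDEFINED_VARIABLE are hypothetical variable names, set or cleared here only for the demonstration.

```python
import os

# Define a hypothetical environment variable for this process.
os.environ["DEMO_DATA_DIR"] = os.path.join("srv", "data")
# Make sure the second hypothetical variable is NOT defined.
os.environ.pop("DEMO_UNDEFINED_VARIABLE", None)

# $NAME references are replaced with the variable's value.
expanded = os.path.expandvars(os.path.join("$DEMO_DATA_DIR", "log.txt"))

# References to undefined variables are left unchanged.
untouched = os.path.expandvars("$DEMO_UNDEFINED_VARIABLE")

print(expanded)
print(untouched)
```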
Opening Files
- To open a file, we use built-in open() function:
- The open() function takes a filename as an argument. Here the filename is mydir/myfile.txt, and the next argument is a processing mode.
- The mode is usually the string 'r' to open for text input (this is the default mode), or 'w' to create and open for text output.
- The string 'a' is to open for appending text to the end. The mode argument can specify additional options: adding a 'b' to the mode string allows for binary data, and adding a + opens the file for both input and output.
- The table below lists several combinations of the processing modes:
- There are things we should know about the filename:
- It's not just the name of a file. It's a combination of a directory path and a filename. In Python, whenever we need a filename, we can include some or all of a directory path as well.
- The directory path uses forward slashes regardless of the operating system. Windows uses backslashes to denote subdirectories, while Linux uses forward slashes. But in Python, forward slashes always work, even on Windows.
- The directory path does not begin with a slash or a drive letter, so it is called a relative path.
- It's a string. All modern operating systems use Unicode to store the names of files and directories. Python 3 fully supports non-ASCII pathnames.
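The modes discussed in this section can be sketched as follows. The example writes to a temporary directory rather than to mydir/, and passing encoding="utf-8" is an assumption made here so that the result does not depend on the platform default.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, "myfile.txt")

    # 'w' creates the file (or truncates an existing one) for text output.
    with open(path, "w", encoding="utf-8") as f:
        f.write("first line\n")

    # 'a' appends text to the end of the file.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second line\n")

    # 'r' (the default mode) opens the file for text input.
    with open(path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Adding 'b' reads the raw bytes instead of decoded text.
    with open(path, "rb") as f:
        raw = f.read()

print(lines)
```

Using a with block closes each file automatically, even if an error occurs while reading or writing.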