Classes and Instances
- Unlike C++, classes in Python are objects in their own right, even without instances. They are just self-contained namespaces.
- Therefore, as long as we have a reference to a class, we can set or change its attributes anytime we want.
Defining Classes
The following statement makes a class with no attributes attached, and in fact, it's an empty namespace object:
- The name of this class is Student, and it doesn't inherit from any other class. Class names are usually capitalized, but this is only a convention, not a requirement.
- Everything in a class is indented, just like the code within a function, if statement, for loop, or any other block of code. The first line not indented is outside the class.
- In the code, the pass is the no-operation statement. This Student class doesn't define any methods or attributes, but syntactically, there needs to be something in the definition, thus the pass statement.
- This is a Python reserved word that just means move along, nothing to see here.
- It's a statement that does nothing, and it's a good placeholder when we're stubbing out functions or classes. The pass statement in Python is like an empty set of curly braces {} in Java or C.
- Then, we attach attributes to the class by assigning names to it from outside the class. In this case, the class is basically an object with field names attached to it.
Note that this is working even though there are no instances of the class yet.
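The steps described above can be sketched as follows. Only the class name Student and the attribute name come from the text; the second attribute and both values are illustrative assumptions.

```python
# An empty class: 'pass' is a placeholder so the body is syntactically valid.
class Student:
    pass


# Attach attributes to the class object itself; no instance exists yet.
Student.name = "Jack"   # illustrative value, not from the original
Student.age = 32        # hypothetical second attribute

print(Student.name, Student.age)
```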
The __init__() method
- Many classes are inherited from other classes, but the one in the example is not. Many classes define methods, but this one does not.
- There is nothing that a Python class absolutely must have, other than a name. In particular, C++ programmers may find it odd that Python classes don't have explicit constructors and destructors.
- Although it's not required, Python classes can have something similar to a constructor: the __init__() method.
In Python, objects are created in two steps:
- __new__() constructs the object
- __init__() initializes the object
However, it's very rare to actually need to implement __new__() because Python constructs our objects for us. So, in most cases, we only implement the special method __init__().
- Let's create a class that stores a string and a number:
- When a def appears inside a class, it is usually known as a method.
- It automatically receives a special first argument, self, that provides a handle back to the instance to be processed. Methods with two underscores at the start and end of names are special methods.
- The __init__() method is called immediately after an instance of the class is created. It would be tempting to call this the constructor of the class.
- It's really tempting, because it looks like a C++ constructor, and by convention, the __init__() method is the first method defined for the class.
- It appears to be acting like a constructor because it's the first piece of code executed in a newly created instance of the class.
- However, it's not like a constructor, because the object has already been constructed by the time the __init__() method is called, and we already have a valid reference to the new instance of the class.
- The first parameter of the __init__() method, self, is equivalent to this in C++. We do not have to pass it when we call a method, since Python does that for us, but we must list self as the first parameter of nonstatic methods. Python keeps self explicit to make attribute access more obvious.
- self is always a reference to the current instance of the class. Although this argument fills the role of the reserved word this in C++ or Java, self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
When a method assigns to a self attribute, it creates an attribute in an instance because self refers to the instance being processed.
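A minimal sketch of such a class, storing a string and a number, might look like this. The parameter names name and number and the sample values are assumptions made for illustration.

```python
class Student:
    def __init__(self, name, number):
        # self refers to the instance being initialized; each assignment
        # below creates an attribute on that instance.
        self.name = name        # the string
        self.number = number    # the number


# Python passes the new instance as self automatically.
s = Student("Jack", 32)
print(s.name, s.number)
```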
Instantiating classes
- To instantiate a class, simply call the class as if it were a function, passing the arguments that the __init__() method requires.
- The return value will be the newly created object. In Python, there is no explicit new operator like there is in C++ or Java.
- So, we simply call a class as if it were a function to create a new instance of the class:
- We are creating an instance of the Student class and assigning the newly created instance to the variable s.
- We are passing one parameter, args, which will end up as the argument in Student's __init__() method.
- s is now an instance of the Student class. Every class instance has a built-in attribute, __class__, which is the object's class.
- Java programmers may be familiar with the Class class, which contains methods like getName() and getSuperclass() to get metadata information about an object. In Python, this kind of metadata is available through attributes, but the idea is the same.
- We can access the instance's docstring just as with a function or a module. All instances of a class share the same docstring.
We can use the Student class defined above as follows:
Unlike in C++, the attributes of a Python object are public, so we can access them using the dot (.) operator:
We can also assign a new value to the attribute:
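The instantiation, metadata, and attribute access steps above can be sketched together as follows. The docstring text, the __init__() signature, and the sample values are assumptions, since the original listing is not shown.

```python
class Student:
    """Stores a student's name and id."""   # hypothetical docstring

    def __init__(self, name, id):
        self.name = name
        self.id = id


s = Student("Jack", 20001)      # call the class like a function; no 'new'

print(s.__class__)              # the class that s was created from
print(s.__class__.__name__)     # the class name as a string
print(s.__doc__)                # instances share the class docstring

print(s.name)                   # attributes are public: read with the dot operator
s.id = 20002                    # and assign a new value the same way
print(s.id)
```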
How about the object destruction?
- Python has automatic garbage collection. Actually, when an object is about to be garbage-collected, its __del__() method is called, with self as its only argument. But we rarely use this method.
Instance variables
Let's look at another example:
What is self.id?
- It's an instance variable. It is completely separate from id, which was passed into the __init__() method as an argument. self.id is visible throughout the instance.
- That means that we can access it from other methods. Instance variables are specific to one instance of a class.
- For example, if we create two Student instances with different id values, they will each remember their own values.
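As a rough illustration of the points above, the sketch below stores id on the instance and reads it back from another method. The method show_id() is a hypothetical helper added only for this example.

```python
class Student:
    def __init__(self, id):
        # self.id is the instance variable; id is only the local argument.
        self.id = id

    def show_id(self):   # hypothetical helper method
        # self.id is reachable from any method of the same instance.
        return self.id


s1 = Student(100)
s2 = Student(200)
print(s1.show_id(), s2.show_id())   # each instance remembers its own value
```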
Then, let's make two instances:
- Here, we generated instance objects. These objects are just namespaces that have access to their classes' attributes.
- The two instances have links back to the class from which they were created. If we use an instance with the name of an attribute of class object, Python retrieves the name from the class.
- Note that neither s1 nor s2 has a setData() attribute of its own. So, Python follows the link from instance to class to find the attribute.
- In the setData() function inside Student, the value passed in is assigned to self.data. Within a method, self automatically refers to the instance being processed (s1 or s2). Thus, the assignments store values in the instances' namespaces, not the class's.
- When we call the class's display() method to print self.data, we see the self.data differs in each instance. But the name display() itself is the same in s1 and s2:
- Note that we stored different object types in the data member in each instance. In Python, there are no declarations for instance attributes (members).
- They come into existence when they are assigned values. The attribute named data does not even exist in memory until it is assigned within the setData() method.
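The behavior described in this list can be sketched as follows. The method names setData() and display() come from the text; the stored values are illustrative assumptions.

```python
class Student:
    def setData(self, value):
        # Creates (or rebinds) the 'data' attribute on this instance only.
        self.data = value

    def display(self):
        print(self.data)


s1 = Student()
s2 = Student()

# No 'data' attribute exists until setData() assigns it.
print(hasattr(s1, "data"))   # False

s1.setData("Jake")      # a string in one instance
s2.setData(34567)       # an integer in the other

s1.display()
s2.display()

# setData lives in the class namespace, not in either instance's.
print("setData" in vars(s1), "setData" in vars(Student))   # False True
```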
example A
Then, we generate instance objects:
- The instance objects are just namespaces that have access to their classes' attributes. Actually, at this point, we have three objects: a class and two instances.
- Note that neither a nor a2 has a setData attribute of its own. However, the value passed into the setData is assigned to self.data. Within a method, self automatically refers to the instance being processed (a or a2).
- So, the assignments store values in the instances' namespaces, not the class's. Methods must go through the self argument to get the instance to be processed. We can see it from the output:
As expected, each instance object stored its own value even though we used the same method, display. The self argument makes all the difference: it refers to the instance being processed.
example B
A superclass is listed in parentheses in the class header, as in the example below.
MyClassB redefines the display method of its superclass. It replaces the display attribute while still inheriting the setData method from MyClassA, as we see below:
But an instance of MyClassA still uses the display method previously defined in MyClassA.
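Examples A and B can be sketched together as follows. The class, instance, and method names follow the text; the method bodies, the name b, and the stored values are assumptions, since the original listings are not shown.

```python
class MyClassA:
    def setData(self, d):
        self.data = d

    def display(self):
        print(self.data)


class MyClassB(MyClassA):           # superclass listed in parentheses
    def display(self):              # replaces the inherited display
        print("B data = %s" % self.data)


# Example A: two instances sharing the same methods
a = MyClassA()
a2 = MyClassA()
a.setData("arbitrary")
a2.setData(1234)
a.display()
a2.display()

# Example B: setData is inherited, display is overridden
b = MyClassB()
b.setData(5678)
b.display()
a.display()    # MyClassA instances still use MyClassA's display
```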
Methods
Here is an example of a class Rectangle with a member function returning its area.
- Note that this version is using direct attribute access for the width and height.
We could have used the following implementing setter and getter methods:
Object attributes are where we store our information, and in most cases the following syntax is enough:
However, there are cases when more flexibility is required. For example, to validate values in setter and getter methods, we may need to change the whole code like this:
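The two styles can be contrasted in a short sketch. The class name CheckedRectangle and the rule that dimensions must be non-negative are assumptions chosen for illustration, not taken from the original listing.

```python
class Rectangle:
    """Direct attribute access."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class CheckedRectangle:   # hypothetical name for the getter/setter variant
    """Setter and getter methods with validation."""

    def __init__(self, width, height):
        self.setWidth(width)
        self.setHeight(height)

    def getWidth(self):
        return self._width

    def setWidth(self, width):
        if width < 0:
            raise ValueError("width must be non-negative")
        self._width = width

    def getHeight(self):
        return self._height

    def setHeight(self, height):
        if height < 0:
            raise ValueError("height must be non-negative")
        self._height = height

    def area(self):
        return self.getWidth() * self.getHeight()


r = Rectangle(3, 4)
r.width = 5            # plain assignment
print(r.area())        # 20

c = CheckedRectangle(3, 4)
c.setWidth(5)          # every caller must switch to method calls
print(c.area())        # 20
```

Note that switching from the first style to the second changes how every caller reads and writes the attribute.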
Properties
- The solution for the issue of flexibility is to allow us to run code automatically on attribute access, if needed.
- The properties allow us to route a specific attribute access (attribute's get and set operations) to functions or methods we provide, enabling us to insert code to be run automatically.
- A property is created by assigning the result of a built-in function to a class attribute:
We pass
- fget: a function for intercepting attribute fetches
- fset: a function for assignments
- fdel: a function for attribute deletions
- doc: receives a documentation string for the attribute
If we go back to the earlier code, and add property(), then the code looks like this:
We can use the class as below:
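A sketch of a Rectangle that routes width through property() might look like this. The accessor names, the trace messages, and the docstring are assumptions; only property() and its fget, fset, fdel, and doc arguments come from the text.

```python
class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def getWidth(self):
        print("fetching width")
        return self._width

    def setWidth(self, value):
        print("setting width")
        self._width = value

    def delWidth(self):
        print("deleting width")
        del self._width

    # property(fget, fset, fdel, doc)
    width = property(getWidth, setWidth, delWidth, "The rectangle's width.")

    def area(self):
        return self.width * self._height


r = Rectangle(3, 4)
print(r.width)                   # runs getWidth
r.width = 5                      # runs setWidth
print(r.area())                  # 20
print(Rectangle.width.__doc__)   # the doc argument
```

Callers keep using plain attribute syntax, while the class decides what code runs on each access.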
The example above simply traces attribute accesses. More often, properties compute the value of an attribute dynamically when it is fetched, as the following example illustrates:
- The class defines an attribute V that is accessed as though it were static data. However, it really runs code to compute its value when fetched.
- When the code runs, the value is stored in the instance as state information, but whenever we retrieve it via the managed attribute, its value is automatically squared.
Again, note that the fetch computes the square of the instance's data.
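A minimal sketch of such a computed attribute follows. The attribute name V comes from the text; the class name PropSquare, the accessor names, and the sample values are assumptions.

```python
class PropSquare:   # hypothetical class name
    def __init__(self, start):
        self.value = start          # state stored on the instance

    def getV(self):
        return self.value ** 2      # computed on every fetch

    def setV(self, value):
        self.value = value          # stored unsquared

    V = property(getV, setV)


p = PropSquare(3)
print(p.V)    # 9
p.V = 4
print(p.V)    # 16
```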
Operator overloading: 2.6 __cmp__() (Removed in 3.0)
- By implementing the __cmp__() method, all of the comparison operators (<, ==, !=, >, etc.) will work.
So, let's add the following to our Rectangle class:
- Note that we used the built-in cmp() function to implement __cmp__. The cmp() function returns -1 if the first argument is less than the second, 0 if they are equal, and 1 if the first argument is greater than the second.
- In Python 3.0, we get TypeError: unorderable types, because the __cmp__() method and the cmp() built-in function were removed. So, we need to implement the specific comparison methods instead.
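For Python 3, one possible approach is to define __eq__() and __lt__() and let functools.total_ordering derive the remaining operators. Comparing rectangles by area is an assumption made for this sketch, since the original __cmp__() listing is not shown.

```python
from functools import total_ordering


@total_ordering
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def __eq__(self, other):
        if not isinstance(other, Rectangle):
            return NotImplemented
        return self.area() == other.area()

    def __lt__(self, other):
        if not isinstance(other, Rectangle):
            return NotImplemented
        return self.area() < other.area()


r1 = Rectangle(2, 3)
r2 = Rectangle(3, 4)
print(r1 < r2, r1 == r2, r1 >= r2)   # True False False
```

A class that defines __eq__() without __hash__() becomes unhashable, which is worth keeping in mind if instances need to go into sets or dictionary keys.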
Operator overloading: __str__
The __str__ method is the second most commonly used operator overloading method in Python, after __init__. It runs automatically whenever an instance is converted to its print string.
- Let's use the previous example:
If we print the instance, it displays the object as a whole as shown below.
- It displays the object's class name, and its address in memory which is basically useless except as a unique identifier.
- So, let's add the __str__ method:
- The code above extends our class to give a custom display that lists attributes when our class's instances are displayed as a whole, instead of relying on the less useful display. Note that we're doing string % formatting to build the display string in __str__.
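The difference between the default display and a custom __str__ can be sketched as follows. The classes Plain and Rectangle and the format string are assumptions, since the previous example the text refers to is not shown.

```python
class Plain:
    pass


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def __str__(self):
        # String % formatting builds the display string.
        return "Rectangle: width=%s, height=%s" % (self.width, self.height)


print(Plain())     # default display: class name and memory address

r = Rectangle(3, 4)
print(r)           # Rectangle: width=3, height=4
```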
__str__ vs __repr__
The difference between __str__ and __repr__ is not that obvious.
- When we use print, Python will look for an __str__ method in our class. If it finds one, it will call it. If it does not, it will look for a __repr__ method and call it. If it cannot find one, it will create an internal representation of our object.
- Without customization, print(x) and simply echoing the object x give little information. That's why we customize the class with __str__.
However, __str__ is not used when we call print(myObjects) on a list that holds the instances, because a list displays its items with __repr__:
So, we need to define __repr__:
In his book, Learning Python, Mark Lutz summarizes as follows:
- ...__repr__, provides an as-code low-level display of an object when present. Sometimes classes provide both a __str__ for user-friendly displays and a __repr__ with extra details for developers to view.
- Because printing runs __str__ and the interactive prompt echoes results with __repr__, this can provide both target audiences with an appropriate display.
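The split between the two methods can be seen in a short sketch. The class name MyClass, the list name myObjects, and the format strings are assumptions made for illustration.

```python
class MyClass:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        # User-friendly display, used by print() and str()
        return "MyClass named %s" % self.name

    def __repr__(self):
        # As-code display, used by the interactive echo and by containers
        return "MyClass(%r)" % self.name


x = MyClass("John")
myObjects = [MyClass("John"), MyClass("Jane")]

print(x)            # MyClass named John
print(repr(x))      # MyClass('John')
print(myObjects)    # [MyClass('John'), MyClass('Jane')]
```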
sys.argv
- sys.argv is the list of arguments passed to the Python program.
- The first argument, sys.argv[0], is actually the name of the program.
It exists so that we can change our program's behavior depending on how it was invoked. For example:
- Whether sys.argv[0] is a full pathname or not depends on the operating system.
- If the command was executed using the -c command line option to the interpreter, sys.argv[0] is set to the string '-c'.
- If no script name was passed to the Python interpreter, sys.argv[0] is the empty string.
sys.argv[1] is thus the first argument we actually pass to the program.
If we run it with or without any argument, we get the output below:
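A small script along these lines might look as follows. The helper describe_args() and its messages are hypothetical; only sys.argv itself comes from the text.

```python
import sys


def describe_args(argv):   # hypothetical helper
    """Return a short report of the program name and its arguments."""
    program = argv[0]       # the name of the program
    args = argv[1:]         # the arguments we actually passed
    if not args:
        return "%s: no arguments given" % program
    return "%s: first argument is %r, %d argument(s) in total" % (
        program, args[0], len(args))


if __name__ == "__main__":
    print(describe_args(sys.argv))
```

Taking the argument list as a parameter, rather than reading sys.argv inside the function, keeps the logic easy to try with different inputs.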
getopt.getopt(args, options[, long_options])
- The getopt.getopt() parses command line options and parameter list. args is the argument list to be parsed, without the leading reference to the running program.
- Typically, this means sys.argv[1:]. options is the string of option letters that the script wants to recognize, with options that require an argument followed by a colon (:).
- long_options, if specified, must be a list of strings with the names of the long options which should be supported. The leading -- characters should not be included in the option name.
- Long options which require an argument should be followed by an equal sign (=). Optional arguments are not supported.
- To accept only long options, options should be an empty string. Long options on the command line can be recognized so long as they provide a prefix of the option name that matches exactly one of the accepted options:
- For example, if long_options is ['foo', 'frob'], the option --fo will match as --foo, but --f will not match uniquely, so GetoptError will be raised.
The return value consists of two elements:
- the first is a list of (option, value) pairs
- the second is the list of program arguments left after the option list was stripped (this is a trailing slice of args).
- Each option-and-value pair returned has the option as its first element, prefixed with a hyphen for short options (e.g., '-x') or two hyphens for long options (e.g., '--long-option'), and the option argument as its second element, or an empty string if the option has no argument.
- The options occur in the list in the same order in which they were found, thus allowing multiple occurrences. Long and short options may be mixed.
If we run it with two long file options:
With short options:
Note that the long option does not have to be fully matched:
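The three cases above can be sketched with a small wrapper around getopt.getopt(). The helper parse(), the option names (-h, -i, -o, --help, --input, --output), and the file names are assumptions chosen for illustration.

```python
import getopt


def parse(args):   # hypothetical helper
    """Parse args (typically sys.argv[1:]) and return (opts, remaining)."""
    # Short options: 'i:' and 'o:' take an argument, 'h' does not.
    # Long options: 'input=' and 'output=' take an argument, 'help' does not.
    return getopt.getopt(args, "hi:o:", ["help", "input=", "output="])


# Two long options
print(parse(["--input", "in.txt", "--output", "out.txt"]))

# Short options, followed by a leftover program argument
print(parse(["-i", "in.txt", "-o", "out.txt", "extra"]))

# A unique prefix of a long option is accepted and reported by its full name
print(parse(["--in=in.txt"]))

# An ambiguous prefix raises GetoptError
try:
    getopt.getopt(["--f"], "", ["foo", "frob"])
except getopt.GetoptError as err:
    print("error:", err)
```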
__call__
- Python runs the __call__ method when an instance is called like a function. This is especially useful when we interface with APIs that expect functions. On top of that, we can also retain state info, as we see in the later examples of this section.
- According to Mark Lutz (Learning Python), it is believed to be the third most commonly used operator overloading method, behind __init__ and the pair __str__ and __repr__.
- Here is an example with a __call__ method in a class. The __call__ method simply prints out the arguments it takes via keyword args.
Here is another example, with a mix of positional and keyword arguments:
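Both calling styles can be sketched with a single class. The class name A and the choice to return the arguments as well as print them are assumptions made for this example.

```python
class A:   # hypothetical class name
    def __call__(self, *args, **kwargs):
        # Runs whenever an instance is called like a function.
        print("Called with:", args, kwargs)
        return args, kwargs


a = A()
a(x=1, y=2)                 # keyword arguments only
a(1, 2, x=3, y=4)           # positional and keyword arguments mixed
```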
While the __call__ method allows us to use class instances to emulate functions, as we saw in the examples above, it has another use, which is to retain state info:
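As a rough sketch of state retention, the class below remembers a factor between calls. The class name Multiplier and the sample values are assumptions.

```python
class Multiplier:   # hypothetical class name
    def __init__(self, factor):
        self.factor = factor        # state retained between calls

    def __call__(self, value):
        return value * self.factor


double = Multiplier(2)
triple = Multiplier(3)
print(double(5), triple(5))             # 10 15

# An instance can be passed wherever an API expects a function.
print(list(map(double, [1, 2, 3])))     # [2, 4, 6]
```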