Initialization, encapsulation, and privacy
The general Python policy regarding privacy can be summed up as follows: we're all adults here.
Object-oriented design makes an explicit distinction between interface and implementation. This is a consequence of the idea of encapsulation. A class encapsulates a data structure, an algorithm, an external interface, or something meaningful. The idea is to have the capsule separate the class-based interface from the implementation details.
However, no programming language reflects every design nuance. Python, typically, doesn't implement all design considerations as explicit code.
One aspect of a class design that is not fully carried into code is the distinction between the private (implementation) and public (interface) methods or attributes of an object. The notion of privacy in languages that support it (C++ and Java are two examples) is already quite complex. These languages include settings such as private, protected, and public, as well as "not specified", which is a kind of semiprivate. The private keyword is often used incorrectly, making subclass definition needlessly difficult.
Python's notion of privacy is simple, as follows:
- It's all essentially public. The source code is available. We're all adults. Nothing can be truly hidden.
- Conventionally, we'll treat some names in a way that's less public. They're generally implementation details that are subject to change without notice, but there's no formal notion of private.
Names that begin with _ are honored as less public by some parts of Python. The help() function generally ignores these methods. Tools such as Sphinx can conceal these names from documentation.
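As a minimal sketch of this convention, consider the following class. The Card class and its attribute names are hypothetical, invented only for illustration. The sketch uses pydoc.render_doc(), the machinery behind help(), to capture the generated help text so that we can see which names are listed:

```python
import pydoc


class Card:
    """A playing card (hypothetical example class)."""

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit
        # Less-public attribute: an implementation detail.
        self._points = self._compute_points()

    def _compute_points(self):
        # Less-public helper: subject to change without notice.
        return 10 if self.rank in ("J", "Q", "K") else int(self.rank)

    def points(self):
        """Return the point value of this card."""
        return self._points


card = Card("K", "spades")

# Nothing prevents a caller from reaching the less-public name.
print(card._points)

# Render the same text that help(Card) would display.
help_text = pydoc.render_doc(Card, renderer=pydoc.plaintext)
print("points" in help_text)
print("_compute_points" in help_text)
```

The leading _ doesn't stop anyone from using card._points; it only signals that the name isn't part of the interface we intend to support.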
Python's internal names begin (and end) with __. This is how Python internals are kept from colliding with application features above the internals. The collection of these internal names is fully defined by the language reference. Further, there's no benefit to trying to use __ to attempt to create a "super private" attribute or method in our code. If we invent a name that begins and ends with __, all that happens is that we create a potential future problem if a release of Python ever starts using the name we chose for internal purposes. If we use a name that begins with __ but doesn't end with it inside a class, we're likely to run afoul of the name mangling that Python applies to these names.
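The following sketch shows what name mangling does in practice. The Base and Derived classes and the __token attribute are hypothetical names chosen for illustration:

```python
class Base:
    def __init__(self):
        # Inside the class body, this name is rewritten to _Base__token.
        self.__token = "base"

    def token(self):
        return self.__token


class Derived(Base):
    def __init__(self):
        super().__init__()
        # Rewritten to _Derived__token: a second, separate attribute.
        self.__token = "derived"


d = Derived()
print(d.token())        # base
print(sorted(vars(d)))  # ['_Base__token', '_Derived__token']
```

The subclass never overrides the attribute that Base.token() reads, which is rarely what we wanted. The mangled names remain reachable as d._Base__token, so nothing has actually been hidden.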
The rules for the visibility of Python names are as follows:
- Most names are public.
- Names that start with _ are somewhat less public. Use them for implementation details that are truly subject to change.
- Names that begin and end with __ are internal to Python. We never make these up; we use the names defined by the language reference.
Generally, the Python approach is to register the intent of a method (or attribute) using documentation and a well-chosen name. Often, the interface methods will have elaborate documentation, possibly including doctest examples, while the implementation methods will have more abbreviated documentation and may not have doctest examples.
For programmers new to Python, it's sometimes surprising that privacy is not more widely used. For programmers experienced in Python, it's surprising how many brain calories get burned sorting out private and public declarations that aren't really very helpful because the intent is obvious from the method names and the documentation.