{ "metadata": { "kernelspec": { "display_name": "Bash", "language": "bash", "name": "bash" }, "language_info": { "codemirror_mode": "shell", "file_extension": ".sh", "mimetype": "text/x-sh", "name": "bash" } }, "nbformat": 4, "nbformat_minor": 5, "cells": [ { "id": "metadata", "cell_type": "markdown", "source": "
But before you dive into writing some more code and\nsharing it with others, ask yourself what kind of code should you be writing and publishing? It may be\nworth spending some time learning a bit about Python\ncoding style conventions to make sure that your code is consistently formatted and readable by yourself and others.
\n\n\n“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” - Martin Fowler, British software engineer, author and international speaker on software development
\n
\n\nComment\nThis tutorial is significantly based on the Carpentries lesson “Intermediate Research Software Development”.
\n
\n\n
One of the most important things we can do to make sure our code is readable by others\n(and ourselves a\nfew months down the line) is to make sure that it is cleanly and consistently formatted and uses sensible,\ndescriptive names for variables, functions and modules. In order to help us format our code, we generally follow\nguidelines known as a style guide. A style guide is a set of conventions that we agree upon with our colleagues or\ncommunity, to ensure that everyone contributing to the same project is producing code which looks similar in style.\nWhile a group of developers may choose to write and agree upon a new style guide unique to each project,\nin practice many programming languages have a single style guide which is\nadopted almost universally by their communities around the world. In Python, although we do have a choice of style guides\navailable, the PEP8 style guide is most commonly used.\nA Python Enhancement Proposal (PEP) is a design document for the Python community, typically providing\nspecifications or conventions for how to do something in Python, a description of a new feature in Python, etc.
\n\n\n\nOne of the\nkey insights from Guido van Rossum,\none of the PEP8 authors, is that code is read much more often than it is\nwritten. Style guidelines are intended to improve the readability of code and make it consistent across the\nwide spectrum of Python code. Consistency with the style guide is important. Consistency within a project is more\nimportant. Consistency within one module or function is the most important. However, know when to be inconsistent –\nsometimes style guide recommendations are just not applicable. When in doubt, use your best judgment.\nLook at other examples and decide what looks best. And don’t hesitate to ask!
\n
A full list of style guidelines for this style\nis available from the PEP8 website; here we highlight a few.
\nPython is a kind of language that uses indentation as a way of grouping statements that belong to a particular\nblock of code. Spaces are the recommended indentation method in Python code. The guideline is to use 4 spaces per indentation level -\nso 4 spaces on level one, 8 spaces on level two and so on.
\nMany people prefer tabs to spaces for indenting code, for several reasons (e.g. spaces need additional typing, and it is easy to\nintroduce an error by missing a single space character), and do not follow this guideline. Whether you decide to\nfollow this guideline or not, be consistent and follow the style already used in the project.
\n\n\n\nFor many users of Braille Refreshable Displays, encoding the indentation with a tab character is easier to read and more efficient. Rather than 8 space characters, they would only see 2 tab characters, which take up less space and make the indentation level easier to comprehend.
\n
\n\n\nPython 2 allowed code\nindented with a mixture of tabs and spaces. Python 3 disallows mixing the use of tabs and spaces for indentation.\nWhichever you choose, be consistent throughout the project.
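
A minimal sketch of that Python 3 behaviour, assuming nothing beyond the standard library: the built-in compile() function raises a TabError when lines in the same block are indented with different characters. The sample source and the helper name mixes_tabs_and_spaces are illustrative assumptions, not part of the original lesson.

```python
TAB = '\t'  # named so the otherwise invisible character is easy to spot

# Two statements in the same block: one indented with spaces, one with a tab.
lines = [
    'if True:',
    '    x = 1',
    TAB + 'x = 2',
]
source = '\n'.join(lines) + '\n'


def mixes_tabs_and_spaces(code):
    """Return True if Python 3 rejects the code with a TabError."""
    try:
        compile(code, '<example>', 'exec')
    except TabError:
        return True
    return False


print(mixes_tabs_and_spaces(source))
```

The code is only compiled, never executed, so the check is safe to run on any snippet.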
\n
\n\n\nMany IDEs and editors have built in support for automatically managing tabs and spaces and auto-converting them back and forth.
\n
There are more complex rules on indenting single units of code that continue over several lines, e.g. function,\nlist or dictionary definitions can all take more than one line. The preferred way of wrapping such long lines is by\nusing Python’s implied line continuation inside delimiters such as parentheses (()), brackets ([]) and braces\n({}), or a hanging indent.
# Add an extra level of indentation (extra 4 spaces) to distinguish arguments from the rest of the code that follows\ndef long_function_name(\n        var_one, var_two, var_three,\n        var_four):\n    print(var_one)\n\n\n# Aligned with opening delimiter\nfoo = long_function_name(var_one, var_two,\n                         var_three, var_four)\n# Use hanging indents to add an indentation level like paragraphs of text where all the lines in a paragraph are\n# indented except the first one\nfoo = long_function_name(\n    var_one, var_two,\n    var_three, var_four)\n# Using hanging indent again, but closing bracket aligned with the first non-blank character of the previous line\na_long_list = [\n    [[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[0.33, 0.66, 1], [0.66, 0.83, 1], [0.77, 0.88, 1]]\n    ]\n# Using hanging indent again, but closing bracket aligned with the start of the multiline construct\na_long_list2 = [\n    1,\n    2,\n    3,\n    # ...\n    79\n]\n
More details on good and bad practices for continuation lines can be found in\nPEP8 guideline on indentation.
\nAll lines should be at most 79 characters long; for lines containing comments or docstrings (to be covered later) the\nline length limit should be 72 - see this discussion for reasoning behind these numbers. Some teams strongly prefer a longer line length, and seem to have settled on the\nlength of 100. Long lines of code can be broken over multiple lines by wrapping expressions in delimiters, as\nmentioned above (preferred method), or using a backslash (\\) at the end of the line to indicate\nline continuation (slightly less preferred method).
# Using delimiters ( ) to wrap a multi-line expression\nif (a == True and\n        b == False):\n    pass\n# Using a backslash (\\) for line continuation\nif a == True and \\\n        b == False:\n    pass\n
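
As a rough illustration of the line length rule, the sketch below reports which lines exceed a chosen limit. The function name find_long_lines and the sample lines are assumptions made for this example; the default of 79 follows the PEP 8 recommendation, and in a real project a linter would normally do this check for you.

```python
def find_long_lines(lines, limit=79):
    """Return (line number, length) pairs for lines longer than the limit."""
    return [
        (number, len(line))
        for number, line in enumerate(lines, start=1)
        if len(line) > limit
    ]


# Hypothetical source lines: the second one is deliberately too long.
sample = [
    'short = 1',
    'value = ' + ' + '.join(['some_long_variable_name'] * 5),
]

for number, length in find_long_lines(sample):
    print(f'line {number} is {length} characters long')
```

Passing a different limit (for example limit=100) adapts the check to a team that has agreed on longer lines.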
Lines should break before binary operators so that the operators do not get scattered across different columns\non the screen. In the example below, the eye does not have to do the extra work to tell which items are added\nand which are subtracted:
\n# PEP 8 compliant - easy to match operators with operands\nincome = (gross_wages\n          + taxable_interest\n          + (dividends - qualified_dividends)\n          - ira_deduction\n          - student_loan_interest)\n
Top-level function and class definitions should be surrounded with two blank lines. Method definitions inside a class\nshould be surrounded by a single blank line. You can use blank lines in functions, sparingly, to indicate logical sections.
\nAvoid extraneous whitespace in the following situations:
\nImmediately inside parentheses, brackets or braces
\n# PEP 8 compliant:\nmy_function(colour[1], {id: 2})\n# Not PEP 8 compliant:\nmy_function( colour[ 1 ], { id: 2 } )\n
Immediately before a comma, semicolon, or colon (unless doing slicing where the colon acts like a binary operator\nin which case it should have equal amounts of whitespace on either side)
\n# PEP 8 compliant:\nif x == 4: print(x, y); x, y = y, x\n# Not PEP 8 compliant:\nif x == 4 : print(x , y); x , y = y, x\n
Immediately before the open parenthesis that starts the argument list of a function call
\n# PEP 8 compliant:\nmy_function(1)\n# Not PEP 8 compliant:\nmy_function (1)\n
Immediately before the open parenthesis that starts an indexing or slicing
\n# PEP 8 compliant:\nmy_dct['key'] = my_lst[id]\nfirst_char = my_str[:, 1]\n# Not PEP 8 compliant:\nmy_dct ['key'] = my_lst [id]\nfirst_char = my_str [:, 1]\n
More than one space around an assignment (or other) operator to align it with another
\n# PEP 8 compliant:\nx = 1\ny = 2\nstudent_loan_interest = 3\n# Not PEP 8 compliant:\nx                     = 1\ny                     = 2\nstudent_loan_interest = 3\n
\nAvoid trailing whitespace at the end of a line: if you use a backslash (\\) for continuation lines and have a space after it, the continuation line will not be\ninterpreted correctly.\nDon’t use spaces around the = sign when used to indicate a keyword argument assignment or to indicate a\ndefault value for an unannotated function parameter
\n# PEP 8 compliant use of spaces around = for variable assignment\naxis = 'x'\nangle = 90\nsize = 450\nname = 'my_graph'\n# PEP 8 compliant use of no spaces around = for keyword argument assignment in a function call\nmy_function(\n1,\n2,\naxis=axis,\nangle=angle,\nsize=size,\nname=name)\n
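
The slicing exception mentioned in the whitespace rules above can be sketched as follows. The variable names are made up for this example; the spacing follows the PEP 8 guidance that a slice colon acts like a binary operator with equal whitespace on both sides, and that the spaces can be dropped when the operands are simple.

```python
letters = list('abcdefghij')
lower, upper, offset = 1, 4, 2

# Simple operands: no spaces around the colons.
every_third = letters[1:9:3]

# Complex operands: equal whitespace on both sides of the colon.
shifted = letters[lower + offset : upper + offset]

print(every_third)
print(shifted)
```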
In Python, single-quoted strings and double-quoted strings are the same. PEP8 does not make a recommendation for this\napart from picking one rule and consistently sticking to it. When a string contains single or double quote characters,\nuse the other one to avoid backslashes in the string as it improves readability.
\nThere are a lot of different naming styles in use, including:
\nAs with other style guide recommendations - consistency is key. Pick one and stick to it, or follow the one already\nestablished if joining a project mid-way. Some things to be wary of when naming things in the code:
\nStrongly consider not using single character identifiers wherever possible. Use descriptive variable names when possible: they are no slower to execute, and a lot faster for a reader to read and comprehend.
\n\n\n\n\n
\n- Function and variable names should be lowercase, with words separated by underscores as necessary to improve readability.
\n- Class names should normally use the CapitalisedWords convention.
\n- Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability.
\n- Packages should also have short, all-lowercase names, although the use of underscores is discouraged.
\nA more detailed guide on\nnaming functions, modules, classes and variables\nis available from PEP8.
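
A short sketch of how these naming conventions look together. The class, function and variable names are all hypothetical and chosen only to illustrate the casing rules listed above.

```python
class TemperatureReading:
    """A single reading; the class name uses CapitalisedWords."""

    def __init__(self, station_name, degrees_celsius):
        self.station_name = station_name
        self.degrees_celsius = degrees_celsius

    def to_fahrenheit(self):
        """Return the reading converted to degrees Fahrenheit."""
        return self.degrees_celsius * 9 / 5 + 32


def hottest_reading(readings):
    """Return the reading with the highest temperature."""
    return max(readings, key=lambda reading: reading.degrees_celsius)


readings = [
    TemperatureReading('north_field', 18.5),
    TemperatureReading('south_field', 21.0),
]
print(hottest_reading(readings).station_name)
```

Functions, methods and variables are lowercase with underscores, while the class stands out through its capitalised name.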
\n
Comments allow us to provide the reader with additional information on what the code does - reading and understanding\nsource code is slow, laborious and can lead to misinterpretation, plus it is always a good idea to keep others in mind\nwhen writing code. A good rule of thumb is to assume that someone will always read your code at a later date,\nand this includes a future version of yourself. It can be easy to forget why you did something a particular way in six\nmonths’ time. Write comments as complete sentences and in English unless you are 100% sure the code will never be read\nby people who don’t speak your language.
\n\n\n\nAs a side reading, check out the ‘Putting comments in code: the good, the bad, and the ugly’ blogpost.\nRemember - a comment should answer the ‘why’ question. Occasionally the “what” question.\nThe “how” question should be answered by the code itself.
\n
Block comments generally apply to some (or all) code that follows them, and are indented to the same level as that\ncode. Each line of a block comment starts with a # and a single space (unless it is indented text inside the comment).
def fahr_to_cels(fahr):\n    # Block comment example: convert temperature in Fahrenheit to Celsius\n    cels = (fahr - 32) * (5 / 9)\n    return cels\n
An inline comment is a comment on the same line as a statement. Inline comments should be separated by at least two\nspaces from the statement. They should start with a # and a single space and should be used sparingly.
def fahr_to_cels(fahr):\n    cels = (fahr - 32) * (5 / 9)  # Inline comment example: convert temperature in Fahrenheit to Celsius\n    return cels\n
Python doesn’t have any multi-line comments, like you may have seen in other languages like C++ or Java. However, there\n are ways to do it using docstrings as we’ll see in a moment.
\nThe reader should be able to understand a single function or method from its code and its comments, and should not have to look elsewhere in the code for clarification. The kind of things that need to be commented are:
\nHowever, there are some restrictions. Comments that simply restate what the code does are redundant, and comments must be\n accurate and updated with the code, because an incorrect comment causes more confusion than no comment at all.
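
To make the difference concrete, here is a small sketch contrasting a comment that explains why with one that merely restates the code. Both functions and the sensor scenario are hypothetical and exist only for this illustration.

```python
def mean_valid_reading(readings):
    # Negative values are how the (hypothetical) sensor reports a fault,
    # so they are dropped instead of being averaged as real measurements.
    valid = [reading for reading in readings if reading >= 0]
    return sum(valid) / len(valid)


def count_readings(readings):
    count = 0
    for _ in readings:
        count = count + 1  # Add one to count (redundant: restates the code)
    return count


print(mean_valid_reading([1, -5, 3]))
print(count_readings([1, -5, 3]))
```

The first comment records a decision a reader could not recover from the code alone; the second can be deleted without losing any information.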
\nNow, given all of the above rules, we’ll go ahead and re-format some code to match the style guide. Please review them carefully and prepare yourself for an hour of editing text… no, of course not.
\nThere is no reason to do this manually: we’re learning about computers, and reformatting text is something computers are great at! There is a project called Black which automates this process and removes the potential for arguing over coding style, which is truly its greatest success.
\n\n\nHands-on: Reformatting Code with Black\n\n
\n- Install Black:\n\n  pip3 install black\n\n- Run Black on the current directory:\n\n  black .\n\n- Done. That’s it.
\n
\n\nHands-on: Optional Exercise: Improve Code Style of Your Other Python Projects\nIf you have another Python project, check to what extent it conforms to the PEP8 coding style.
\n
If the first thing in a function is a string that is not assigned to a variable, that string is attached to the\nfunction as its documentation. Consider the following code implementing a function for calculating the nth\nFibonacci number:
\ndef fibonacci(n):\n    \"\"\"Calculate the nth Fibonacci number.\n\n    A recursive implementation of Fibonacci array elements.\n\n    :param n: integer\n    :raises ValueError: raised if n is less than zero\n    :returns: Fibonacci number\n    \"\"\"\n    if n < 0:\n        raise ValueError('Fibonacci is not defined for N < 0')\n    if n == 0:\n        return 0\n    if n == 1:\n        return 1\n    return fibonacci(n - 1) + fibonacci(n - 2)\n
Note here we are explicitly documenting our input variables, what is returned by the function, and also when the\nValueError exception is raised. Along with a helpful description of what the function does, this information can\nact as a contract for readers to understand what to expect in terms of behaviour when using the function,\nas well as how to use it.
A special comment string like this is called a docstring. We do not need to use triple quotes when writing one, but\nif we do, we can break the text across multiple lines. Docstrings can also be used at the start of a Python module (a file\ncontaining a number of Python functions) or at the start of a Python class (containing a number of methods) to list\ntheir contents as a reference. You should not confuse docstrings with comments though - docstrings are context-dependent and should only\nbe used in specific locations (e.g. at the top of a module and immediately after class and def keywords as mentioned).\nUsing triple quoted strings in locations where they will not be interpreted as docstrings or\nusing triple quotes as a way to ‘quickly’ comment out an entire block of code is considered bad practice.
In our example case, we used\nthe Sphinx/ReadTheDocs docstring style formatting\nfor the param, raises and returns fields - other docstring formats exist as well.
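
As one example of an alternative format, the sketch below documents the same function using the Google docstring style, with Args, Returns and Raises sections. Only the docstring layout differs; which style to use is a project decision, and this version is an illustration rather than part of the original lesson.

```python
def fibonacci(n):
    """Calculate the nth Fibonacci number.

    A recursive implementation of Fibonacci array elements.

    Args:
        n: An integer index into the sequence.

    Returns:
        The nth Fibonacci number.

    Raises:
        ValueError: If n is less than zero.
    """
    if n < 0:
        raise ValueError('Fibonacci is not defined for N < 0')
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)


print(fibonacci(10))
```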
\n\n\nPEP 257 is another Python Enhancement Proposal, and this one deals with docstring conventions to\nstandardise how they are used. For example, on the subject of module-level docstrings, PEP 257 says:
\n\n\nThe docstring for a module should generally list the classes, exceptions and functions (and any other objects) that\nare exported by the module, with a one-line summary of each. (These summaries generally give less detail than the\nsummary line in the object’s docstring.) The docstring for a package\n(i.e., the docstring of the package’s __init__.py module) should also list the modules and subpackages exported by\nthe package.\n\nNote that the __init__.py file used to be a required part of a package (pre Python 3.3) where a package was typically\nimplemented as a directory containing an __init__.py file which got implicitly executed when a package was imported.
So, at the beginning of a module file we can just add a docstring explaining the nature of a module. For example, if\nfibonacci() was included in a module with other functions, our module could have at the start of it:
\"\"\"A module for generating numerical sequences of numbers that occur in nature.\n\nFunctions:\n fibonacci - returns the Fibonacci number for a given integer\n golden_ratio - returns the golden ratio number to a given Fibonacci iteration\n ...\n\"\"\"\n...\n
The docstring for a function or a module is returned when\ncalling the help function and passing its name - for example from the interactive Python console/terminal available\nfrom the command line or when rendering code documentation online\n(e.g. see Python documentation).\nPyCharm also displays the docstring for a function/module in a little help popup window when using tab-completion.
help(fibonacci)\n
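
Besides help(), the docstring can also be read programmatically. The sketch below assumes a small documented function (a docstring version of fahr_to_cels, written for this example) and shows the \_\_doc\_\_ attribute alongside inspect.getdoc() from the standard library.

```python
import inspect


def fahr_to_cels(fahr):
    """Convert a temperature in Fahrenheit to Celsius.

    :param fahr: temperature in degrees Fahrenheit
    :returns: temperature in degrees Celsius
    """
    return (fahr - 32) * (5 / 9)


# The docstring is stored on the function object itself.
print(fahr_to_cels.__doc__)

# inspect.getdoc() returns the same text with its indentation normalised.
print(inspect.getdoc(fahr_to_cels))
```

Documentation generators and editor popups rely on this same attribute, which is why keeping docstrings accurate pays off in several tools at once.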