Programming

Programming is all about creating applications: implementing algorithms and data structures to build a digital workflow that solves a task with the help of a computer. For this, the computer has to be fed a set of commands it knows how to execute. These are based on the instructions built into the CPU, for example, which can be triggered in a machine level programming language that is all binary and very trivial in a way. Basically all a CPU can actually do is load information from somewhere in RAM into its own registers, perform some bit operations on it and put the result back into some location in RAM. That’s it. Humans, however, want to solve large and complex problems using those trivial operations, and they are not very good at mapping the complexity of their task onto the bit level operations the CPU can perform.
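To make the load, operate and store cycle a bit more tangible, here is a minimal sketch in Python. It is only a toy model and not how a real CPU is programmed: the list ram, the dictionary registers and the register names r0 and r1 are made up for this illustration.

```python
# Toy model only: "ram" and "registers" are hypothetical stand-ins
# for real memory and real CPU registers.
ram = [6, 7, 0]                  # three memory cells
registers = {"r0": 0, "r1": 0}   # two registers

registers["r0"] = ram[0]         # load the first value into a register
registers["r1"] = ram[1]         # load the second value into a register
registers["r0"] = registers["r0"] & registers["r1"]  # bit operation (AND)
ram[2] = registers["r0"]         # store the result back into memory

print(ram[2])
```

Even this tiny task needs four separate steps, which hints at why mapping a real world problem onto such operations by hand quickly becomes tedious.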

This is where programming languages come into play. They offer a level of abstraction above the machine level instruction set to create something more readable and understandable for us humans. There are different strategies and levels of abstraction here. Let’s have a quick look at the ideas behind some of them.

First of all there is assembly language. It is very close to the machine level instruction set, but actually human readable. It’s based on a syntax in which instructions are written line by line, each paired with its operands, which can be CPU registers or values, for example. The result is translated into binary machine language by a program called an assembler. This is the lowest level for a programming language, offering maximum control over the way an algorithm is performed, but on the other hand it is still too simple to be of much help in creating complex programs.

Next in line would be a programming language like C, which offers a much more understandable level of abstraction, but is still considered to be close to the machine. The idea here is that certain sequences of assembly instructions repeat, as their aim is similar in different tasks, so higher level keywords can be introduced, pretty comparable to an alias, which can be translated into assembly language. This greatly reduces the amount of work needed to create more complex programs and also reduces the risk of errors in the code, as well known and tested code blocks can be reused. The price may be some loss in flexibility, as quite a few variants of an algorithmic task with slight differences have to be covered by one construct to make this strategy work. Later the generated assembly code can be translated into machine code and executed to solve the task.

This is also the first level of abstraction that allows for machine independent programming. It’s the compiler’s task to generate the correct assembly language code for the target CPU. This was a key feature for the quick spread of programs across the many existing CPU architectures. It allows you to choose the right hardware platform for the job, whilst still offering a familiar work environment on it, based on the very same set of applications.

There are programming languages on top of that, such as C++ and C#, which are much more data and task oriented and move one step further away from the machine. This is the level of abstraction most humans need in order to successfully solve real world problems using a computer. Basically these languages share the capabilities of the programming language C as a subset, with some very machine-near capabilities removed, and add very powerful data handling capabilities to help working with data and algorithms in a human way. Even so, the high level source code is still translated into machine language using a compiler, creating an executable file on disk which, when started, solves the problem.

In contrast to this there is a family of programming languages which do not work with a compiler. These interpret the source code as it is read from a file and trigger the corresponding machine level actions to make the algorithm work. For humans this offers the very same level of abstraction from the machine as compiler based languages do. Obviously though, the workflow is completely different. These interpreted languages need to analyze and execute the source code every time the program is run, instead of doing the analysis just once and writing a binary representation of the program, which can later be called over and over again. This sounds like a bad idea at first, but it can even have some advantages.

The most prominent programming language in this family is probably Basic. In a way it can be considered as low level as the programming language C, and it has a comparable set of instructions to work with the hardware. Algorithmic details and structures cannot be detected in advance though, because the code is interpreted line by line and the compile step, which would replace these patterns with pre-defined code blocks, is missing. As a consequence code execution is less efficient. On the other hand the source code can easily be adjusted and enhanced to match a new situation, as a re-compile step is not necessary. For compiler based languages this is often a major drawback: each alteration of the code demands a re-compile step, and a compiler is not always available on the computer the task should be run on. As a consequence development with compiler based languages can be slower than with interpreter based approaches.

Python is also an interpreted language, sharing many of the disadvantages of other interpreted programming languages. It has some runtime tweaks though to optimize efficiency and performance, and it is much more comparable to languages such as C++, as it brings a large set of built-in features and a very large standard library shipped with every Python installation. We will learn a bit more about this later on in this book.
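One tweak that may be meant here can be observed directly, assuming the standard CPython interpreter: source code is first translated into an internal bytecode representation, and only that representation is executed. The sketch below uses the built-in functions compile and exec. The variable names source, code_object and namespace are chosen for this illustration only.

```python
# A tiny piece of source code, held as plain text.
source = "result = 2 + 3"

# Translate the text once into a code object (bytecode).
code_object = compile(source, "<example>", "exec")

# Execute the already translated code object twice,
# without analyzing the source text again.
for _ in range(2):
    namespace = {}
    exec(code_object, namespace)

print(namespace["result"])
```

The translation happens once and the result is reused, which softens the "analyze the source on every run" cost described above without turning Python into a classic compiled language.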

What all high level programming languages have in common though is that they are based on structure and code blocks. Python has a very simple approach here. Let’s have a closer look.

Python syntax basics

Here’s a very simple Python program. It calculates the result of a chosen trigonometric function for a given number. Don’t focus on all the unknown keywords here. We’ll get to those later. Just have a look at the text and recognize the structure in it. It shows some of the key aspects of Python.

#!/usr/bin/env python3
import math

function = "cos"
# function = "sin"
number = 0.0

if function == "cos":
    result = math.cos(number)
    print("The calculated cosine value is", result)
elif function == "sin":
    result = math.sin(number)
    print("The calculated sine value is", result)

The calculated cosine value is 1.0

The first line is basically helpful on Unix-like systems only, including Linux and macOS. When a file containing this source code is executed directly, say by triggering it from a file browser, this line tells the operating system how to execute it. It makes the operating system start a program called env, located in the /usr/bin/ folder, which looks up a Python interpreter named python3 on the search path defined by the current environment variables and starts it. Unix operating systems do not care about file suffixes. The usually chosen .py suffix for Python files is not key for executing this script using Python. On Windows systems, on the other hand, the suffix of the file would be the trigger for the operating system to use the Python interpreter to run the file. The top line starts with a hash symbol (#) and an exclamation mark. This combination is called a shebang and marks just this trigger for the operating system to look for an interpreter to execute what comes next. The hash symbol is also what marks the rest of the line as a comment. This prevents Python from interpreting this line of code itself. A neat concept!
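If you are curious which interpreter such a lookup would find on your system, the following sketch may help. It uses shutil.which from the standard library, which searches the same search path, and sys.executable, which names the interpreter currently running the script. The variable name interpreter_path is chosen for this illustration, and the result depends entirely on the machine at hand.

```python
import shutil
import sys

# Search the PATH for a program called python3.
# The result is None if no such program is found.
interpreter_path = shutil.which("python3")

print("python3 found on the search path at:", interpreter_path)
print("This script is running under:", sys.executable)
```

The two paths do not have to be identical, for example when the script was started with a specific interpreter instead of through the shebang line.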

The next line imports the math module. We’ll find the desired trigonometric functions sin and cos in there. Python allows you to import modules at any time, at any position in the source code. Following coding standards though, you are encouraged to place all import statements at the top of the Python script.
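As a small illustration of "at any position", the sketch below places an import statement inside a function. The function name circle_area is made up for this example, and the def keyword is one of those we’ll get to later. This is meant to show what is possible, not what is recommended.

```python
def circle_area(radius):
    # The import is only carried out when the function is called.
    import math
    return math.pi * radius ** 2


print(circle_area(1.0))
```

It works, but a reader now has to search the whole file to find out which modules the script depends on, which is one reason for the convention of keeping imports at the top.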

The following line is an empty line. As a consequence nothing happens when this line is parsed and interpreted. You can use as many empty lines as you desire. They are very helpful to group statements and commands which have a strong relation, or to highlight a piece of source code which should catch the reader’s eye. There are however rules for the use of empty lines as well. We’ll learn about these later on.

Next in line is a variable assignment. We create a variable called function to hold the value cos, which we’ll be using a few lines later. The line after it looks comparable, but is preceded by a hash symbol again. We already know that this makes it a comment line, so it will not be interpreted when running the program. This can be used to deactivate source code in a specific context. Here, we obviously use it to switch between calculating the cos or the sin value of a number. To calculate the sin value, we would just need to uncomment the respective line and comment out the line where the value cos gets assigned to the function variable.

For calculating a sine or cosine value we need an argument. It is provided by assigning the value 0.0 to another variable called number in the next line. Next we see another empty line, which is used for structuring purposes. The block of variable assignments ends here.

This is followed by another block of functionality, where we decide which function to use when calculating the result with an if-elif pattern. We’ll see the details later on. What’s most important here is that the if and elif lines end with a colon and that the two lines following each of them are indented. This is a key feature in Python to create a block of code. Every consecutive statement or command starting at the same level of indentation belongs to the same code block. A code block ends when the indentation returns to a lower level. Python recognizes whitespace (space and tab symbols) for indentation and just looks for levels of indentation. At least one whitespace character in front is necessary to create a new block. To increase readability it is advised though to use four spaces for a single level of indentation, and not to use tabs for this. Most Python capable editors have settings to configure this behaviour and come with proper default values already set.
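The following sketch shows how levels of indentation open and close blocks, including a block nested inside another one. The variable names number and messages are chosen for this illustration only. The texts are collected in a list so the order of execution becomes visible.

```python
messages = []
number = 5

if number > 0:
    messages.append("positive")        # first level: belongs to the if block
    if number % 2 == 1:
        messages.append("odd")         # second level: nested block
    messages.append("still positive")  # back on the first level
messages.append("done")                # no indentation: outside of all blocks

print(messages)
```

The line appending "still positive" closes the nested block simply by returning to the first level of indentation. No closing keyword or bracket is needed.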

So in this context, if the function variable was assigned the value cos, the following code block is executed: it calculates the value of cos(0), assigns it to the variable result and prints this very result as part of a meaningful text. The code block ends after that due to the change of indentation, which leads to the elif related code block. That block would have been executed instead if the value sin had been assigned to the variable function.
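To see both branches at work without editing comments, the sketch below runs the same if-elif pattern for several values in a row. It uses a for loop and a dictionary called results, neither of which has been introduced yet, and the extra value tan is added here only to show what happens when no branch matches.

```python
import math

number = 0.0
results = {}

for function in ("cos", "sin", "tan"):
    if function == "cos":
        results[function] = math.cos(number)
    elif function == "sin":
        results[function] = math.sin(number)
    # "tan" matches neither condition, so no block is executed for it.

print(results)
```

Only one of the two blocks runs per pass, and for tan neither of them does, so nothing is stored for it.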

A very simple, readable and consistent approach!