{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "6TShUYbvSsUE" }, "source": [ "# Introduction to Programming with Python\n", "\n", "---\n", "\n", "**A Tufts University Data Lab Workshop** \n", "Written by Uku-Kaspar Uustalu\n", "\n", "Website: [go.tufts.edu/introPython](https://go.tufts.edu/introPython) \n", "Contact: \n", "Resources: [go.tufts.edu/python](https://go.tufts.edu/python)\n", "\n", "Last updated: `2022-10-25`\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": { "id": "M7BKrSqBSsUE" }, "source": [ "## Introduction to Python Notebooks\n", "\n", "Python Notebooks are made up of **cells**. There are three different kinds of cells:\n", "1. **Code** cells for Python code.\n", "2. **Markdown** cells for text, images, LaTeX equations, etc.\n", "3. **Raw** cells for additional non-Python code that modifies the notebook when exporting it to a different format.\n", "\n", "Each cell can be run (and re-run) independently. \n", "To run a cell, select it by clicking on it and then press Shift+Enter or Ctrl+Enter. \n", "After running a cell, the following cell gets automatically selected.\n", "\n", "This is a **Markdown cell**. Markdown is a lightweight language for formatting text. \n", "To see the code behind a Markdown cell, ***double-click*** on it. 
\n", "To render the Markdown code back into formatted text, you must run the cell.\n", "\n", "Here are some useful Markdown references:\n", "- [GitHub Markdown Guide](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax)\n", "- [Google Colab Markdown guide](https://colab.research.google.com/notebooks/markdown_guide.ipynb)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "S4Hcx9QESsUF" }, "outputs": [], "source": [ "# this is a code cell\n", "# select this cell and press Shift+Enter or Ctrl+Enter to run it\n", "\n", "# in Python, the hashtag (or pound) symbol denotes a comment line\n", "# anything on a comment line is intended for humans and ignored by Python\n", "# adding comments to your code is very important for debugging and sharing\n", "\n", "print(\"Hello World!\")" ] }, { "cell_type": "markdown", "metadata": { "id": "mYTS2es5SsUF" }, "source": [ "↑↑↑ \n", "The **output** of code cells appears right here in the notebook just below the corresponding cell. \n", "Output is also stored in the notebook. Any output of a **saved** notebook will still be there after you close it and open it up again. \n", "If you run a cell again, the output will be overwritten. To clear outputs, go to *Edit > Clear All Outputs*." ] }, { "cell_type": "markdown", "metadata": { "id": "IODatl0QSsUF" }, "source": [ "To learn more about Python Notebooks:\n", "- [JupyterLab User Guide](https://jupyterlab.readthedocs.io/en/stable/)\n", "- [Google Colab Overview](https://colab.research.google.com/notebooks/basic_features_overview.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": { "id": "uCtWBXoySsUF" }, "source": [ "### Understanding how Outputs Work\n", "The output from the last line in a block usually gets printed out (even without a print statement)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6dIXlNZKSsUF" }, "outputs": [], "source": [ "spam = 500\n", "spam * 2" ] }, { "cell_type": "markdown", "metadata": { "id": "Mm__1r4HSsUG" }, "source": [ "But if the output from the last line in a block gets diverted elsewhere (like into a variable), no output will be displayed." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "XehQCrtmSsUG" }, "outputs": [], "source": [ "new_spam = spam * 2" ] }, { "cell_type": "markdown", "metadata": { "id": "UKPqS6ujSsUG" }, "source": [ "Referring to a variable will output its contents, but only if it happens on the last line of a block." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ayKxF1NISsUG" }, "outputs": [], "source": [ "# the contents of spam get outputted\n", "spam" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "0meOg6BqSsUG" }, "outputs": [], "source": [ "# only the contents of new_spam get outputted\n", "spam\n", "new_spam" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "f7QqKxhlSsUG" }, "outputs": [], "source": [ "# nothing gets outputted\n", "new_spam\n", "new_spam = new_spam * 2 // 3 # remember, this is floor division" ] }, { "cell_type": "markdown", "metadata": { "id": "wUUWgVX-SsUG" }, "source": [ "Knowing this behavior makes it easy to see the contents of variables and the results of expressions. Just make sure they are on the last line in a block, or run them in their own block. However, if you want to be completely sure that something gets outputted in the notebook, it is wise to rely on the good old `print()` function." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NTBK-neBSsUG" }, "outputs": [], "source": [ "print(\"now we have\", new_spam, \"cans of spam\")\n", "print(\"previously we had\", spam, \"cans of spam\")" ] }, { "cell_type": "markdown", "metadata": { "id": "VnEUZR69SsUG" }, "source": [ "---\n", "\n", "### Hands-On Exercise\n", "Remember this from before? Run this cell, then change the value of the variables `name` and `age` and run the cell again. \n", "Note how the previous output gets overwritten." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "H7AqM98cSsUH" }, "outputs": [], "source": [ "name = 'Uku' # remember, in Python you can use either ' or \" to denote strings\n", "age = 100\n", "\n", "print(\"My name is\", name, \"and I'm\", age)" ] }, { "cell_type": "markdown", "metadata": { "id": "xuqmOH97SsUH" }, "source": [ "---\n", "\n", "## Lists\n", "\n", "* A **list** is a *mutable* collection/sequence of values that preserves order\n", "* Similar to *arrays* or *vectors* in other languages, but more flexible and forgiving\n", "* Values can be added to or removed from a list and the entire list can be reordered\n", "* A list can contain many different data types and even objects (other lists or data structures)\n", "* You use square brackets, **`[ ]`**, to denote a list in Python\n", "* Individual elements are separated by commas\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "eMcSTeobSsUH" }, "outputs": [], "source": [ "# example of a list\n", "# these might or might not be my exam scores from freshman year\n", "scores = [94, 87, 85, 89, 72, 98, 96, 82, 92]\n", "scores" ] }, { "cell_type": "markdown", "metadata": { "id": "bDeY_e7vSsUH" }, "source": [ "If you are ever unsure what kind of data is stored in a variable, you can use `type()`. 
\n", "*It is also a useful debugging tool.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JV8l0pAxSsUH" }, "outputs": [], "source": [ "type(scores)" ] }, { "cell_type": "markdown", "metadata": { "id": "QpvJzPQJSsUH" }, "source": [ "We can access elements of a list using `list[index]`. This provides both read and write functionality. \n", "We can use it to read elements from a list, but we can also use it with the assignment operator `=` to replace/overwrite elements. \n", "However, despite the flexibility of Python, you cannot use this method to add new elements to a list. \n", "Using `list[index]` with an index that does not exist will result in an error.\n", "\n", "**Remember that Python uses zero-based indexing!**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZcyFPQkcSsUH" }, "outputs": [], "source": [ "# let's take a look at the first element\n", "scores[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Jd8G3Xn9SsUH" }, "outputs": [], "source": [ "# let's say we got to retake the third exam and we improved our score by 6 points\n", "scores[2] = scores[2] + 6\n", "scores" ] }, { "cell_type": "markdown", "metadata": { "id": "r8fhKnhkSsUH" }, "source": [ "Note how the expression to the right of the assignment operator `=` always gets evaluated before the expression to the left of the assignment operator. That allows us to easily get and overwrite the values of variables or list elements using a simple one-line statement.\n", "\n", "To check if a list contains an element, we can use the `in` operator, which reads almost like plain English." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "71PLCQBlSsUH" }, "outputs": [], "source": [ "print(94 in scores)\n", "print(93 in scores)\n", "print(93 not in scores)" ] }, { "cell_type": "markdown", "metadata": { "id": "xBh_7QQfSsUH" }, "source": [ "Python has a lot of built-in functions and methods for working with lists. 
Check out the documentation for an overview.\n", "\n", "* Built-in functions: \n", "* List methods: \n", "\n", "For example, we can use the `.append()` method to add elements to a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wGzU0sQDSsUH" }, "outputs": [], "source": [ "# I just took another test and got a 70 :(\n", "# let's add it to the list anyways\n", "scores.append(70)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bXcN0MMPSsUH" }, "outputs": [], "source": [ "# now take a look at `scores` again\n", "scores" ] }, { "cell_type": "markdown", "metadata": { "id": "oZVvgjKaSsUH" }, "source": [ "The `len()` function gives us the number of elements in a list (or in another collection, such as a string or a dictionary)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Gqtm7KSVSsUH" }, "outputs": [], "source": [ "len(scores)" ] }, { "cell_type": "markdown", "metadata": { "id": "L5YuvHw7SsUH" }, "source": [ "To look at a range of elements, we can use slicing: `list[start:end]` \n", "The slice includes the element at index `start` but stops just before index `end`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "a8N_K1QCSsUH" }, "outputs": [], "source": [ "# let's look at the first three elements\n", "scores[0:3]" ] }, { "cell_type": "markdown", "metadata": { "id": "b3fAflecSsUH" }, "source": [ "To slice from the beginning of the list or until the end of the list, we can just omit the respective index. \n", "Note that you can also use negative indices to count from the end of the list."
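, "\n", "\n", "As a small illustration of slicing, here is a sketch using a made-up list called `letters` (a hypothetical name that is not part of this workshop). It shows that the `end` index itself is left out of the slice and that negative indices count from the end of the list.

```python
# hypothetical example list, not used elsewhere in this notebook
letters = ['a', 'b', 'c', 'd', 'e']

# start is included, end is excluded: this gives the elements at indices 1 and 2
middle = letters[1:3]

# omitting start slices from the beginning of the list
first_two = letters[:2]

# negative indices count from the end, so -2 is the second-to-last element
last_two = letters[-2:]

print(middle)
print(first_two)
print(last_two)
```

A handy consequence of leaving out the `end` index is that `letters[:2]` and `letters[2:]` together cover the whole list without overlapping."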
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "UC-QNSHLSsUH" }, "outputs": [], "source": [ "# we can omit the zero to get the first three elements\n", "scores[:3]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "9dNCMTMNSsUH" }, "outputs": [], "source": [ "# let's look at the last element using negative indexing\n", "scores[-1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NQ66OEwzSsUH" }, "outputs": [], "source": [ "# using the same logic, we can also easily look at the last three elements\n", "scores[-3:]" ] }, { "cell_type": "markdown", "metadata": { "id": "D7iD8YNiSsUH" }, "source": [ "Slicing actually also has an optional third argument: `list[start:end:step]` \n", "When we do not specify `step`, it defaults to one. \n", "You can still omit the `start` or `end` index as before, even when using `step`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5eojUc5TSsUH" }, "outputs": [], "source": [ "# let's look at the second, fourth, and sixth element\n", "# remember that the first element is at index zero\n", "scores[1:6:2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "t-AATm6wSsUK" }, "outputs": [], "source": [ "# what about the first, third, and fifth?\n", "scores[:5:2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "lTOjr3e0SsUL" }, "outputs": [], "source": [ "# omit every second element in the whole list\n", "scores[::2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "WSWqSZSLSsUL" }, "outputs": [], "source": [ "# only look at every second element in the whole list\n", "scores[1::2]" ] }, { "cell_type": "markdown", "metadata": { "id": "1-P_gQygSsUL" }, "source": [ "But what if we want to see the three highest scores? \n", "Let's try the built-in function `sorted()`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fQmBDpFWSsUL" }, "outputs": [], "source": [ "sorted(scores)" ] }, { "cell_type": "markdown", "metadata": { "id": "lZFJD-3QSsUL" }, "source": [ "Is there any way to change the order from ascending to descending? \n", "We can use `help(sorted)` to find out." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rhdLc-zWSsUL" }, "outputs": [], "source": [ "help(sorted)" ] }, { "cell_type": "markdown", "metadata": { "id": "lfUQ-9eoSsUL" }, "source": [ "Or we could take a look at the documentation online: " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tCcb0O49SsUL" }, "outputs": [], "source": [ "sorted(scores, reverse=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "oJbX2MTsSsUL" }, "outputs": [], "source": [ "# now we can extract the three highest scores\n", "sorted(scores, reverse=True)[0:3]" ] }, { "cell_type": "markdown", "metadata": { "id": "171kmyD4SsUL" }, "source": [ "---\n", "\n", "### Functions and methods\n", "\n", "A quick search on StackOverflow reveals that you can also use the `.sort()` method to sort a list.\n", "\n", "But what is the difference?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "EPShXPX8SsUL" }, "outputs": [], "source": [ "sorted(scores) # this is a function" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RAgCUy6wSsUL" }, "outputs": [], "source": [ "scores # the function does not modify the original list, it outputs a copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "uEPHv8uISsUL" }, "outputs": [], "source": [ "scores.sort() # this is a method, it does not output anything" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5ziGk6jySsUL" }, "outputs": [], "source": [ "scores # instead, it modifies the original list" ] }, { "cell_type": "markdown", "metadata": { "id": "Gf5oo3PSSsUL" }, "source": [ "---\n", "\n", "### Strings as Lists\n", "\n", "You can use slicing and indexing to extract letters from a string, just like you would from a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "0p2UnwnySsUL" }, "outputs": [], "source": [ "food = 'egg and spam'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4M35qWgcSsUL" }, "outputs": [], "source": [ "# we can select single letters (remember that the first letter is at index 0)\n", "print(food[0])\n", "print(food[1])\n", "print(food[2])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ihbhADQjSsUL" }, "outputs": [], "source": [ "# or we can select a range of letters\n", "print(food[:3])\n", "print(food[4:7])\n", "print(food[8:])" ] }, { "cell_type": "markdown", "metadata": { "id": "eij6P8KvSsUL" }, "source": [ "You can also use `in` to check for substrings." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rYy6bmIiSsUL" }, "outputs": [], "source": [ "print('e' in food)\n", "print('egg' in food)\n", "print('ham' in food)" ] }, { "cell_type": "markdown", "metadata": { "id": "vweOYt-USsUL" }, "source": [ "---\n", "\n", "### String Methods\n", "\n", "Strings also have tons of useful methods. \n", "Here is a good reference: " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "I4Ir37NbSsUL" }, "outputs": [], "source": [ "# for example, we can easily convert strings to uppercase\n", "food.upper()" ] }, { "cell_type": "markdown", "metadata": { "id": "-By0fQgiSsUL" }, "source": [ "Note that string methods do not modify the original string, because strings in Python are *immutable*. They return a new string instead, unlike list methods such as `.sort()` or `.append()`, which change the list in place." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "eQATcldsSsUM" }, "outputs": [], "source": [ "food" ] }, { "cell_type": "markdown", "metadata": { "id": "I3taOSDxSsUM" }, "source": [ "---\n", "\n", "### Exercises involving Lists and Strings\n", "\n", "Replace `\"baked beans\"` with `\"eggs\"` in the string `menu_item` below.\n", "\n", "*You might want to search for an appropriate method here: *" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "KIB9nZANSsUM" }, "outputs": [], "source": [ "menu_item = 'spam spam spam spam spam spam baked beans spam spam spam and spam'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "GGUKagMASsUM" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "vIq5SP5OSsUM" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> A quick peek at the string methods documentation reveals that `.replace()` would be useful for this. \n", "> Because string methods create a new string, we need to overwrite the variable in order to replace it.\n", "> ```python\n", "> menu_item = menu_item.replace('baked beans', 'eggs')\n", "> ```\n", "\n", "> Feel free to try this out by copying the code from above into a code cell and running it. \n", "> Call `menu_item` or use `print(menu_item)` to investigate the variable and validate your implementation.\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "O9uyoodZSsUM" }, "source": [ "We can use the `.split()` method to split a string into list elements on white space or any other specified string." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fdKGwaWtSsUM" }, "outputs": [], "source": [ "menu_item_list = menu_item.split()\n", "menu_item_list" ] }, { "cell_type": "markdown", "metadata": { "id": "3rJ7rYQxSsUM" }, "source": [ "How many times does `\"spam\"` appear in the list `menu_item_list`?\n", "\n", "*You might want to search the web or refer to the documentation on list methods: *" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "p1rOxBQASsUM" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "VGvh69DLSsUM" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> Another quick peek at the documentation and we find that the `.count()` method does exactly what we need.\n", "> ```python\n", "> menu_item_list.count('spam')\n", "> ```\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "tifq5zbVSsUM" }, "source": [ "---\n", "\n", "## Dictionaries\n", "\n", "What if you want to keep a record of students in a class, their class year, their exam scores, and whether they are a graduate or an undergraduate student?\n", "\n", "This is where dictionaries come in handy. They allow us to store related data with labels, also known as **key-value** pairs. \n", "Dictionaries are denoted with curly braces **`{ }`** in Python.\n", "\n", "More information on dictionaries: " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "org4r1ZeSsUM" }, "outputs": [], "source": [ "# example of a dictionary\n", "student1 = {'first': 'John',\n", "            'last': 'Smith',\n", "            'year': 2022,\n", "            'graduate': True,\n", "            'scores': [98, 94, 95, 86]}\n", "student1" ] }, { "cell_type": "markdown", "metadata": { "id": "GMnFe5pZSsUM" }, "source": [ "To access data from a dictionary, you use square brackets **`[ ]`** with the **key**: `dict[key]`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8Ew_7CpbSsUM" }, "outputs": [], "source": [ "student1['first']" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Yrp-LsECSsUM" }, "outputs": [], "source": [ "student1['scores']" ] }, { "cell_type": "markdown", "metadata": { "id": "GRVknO30SsUM" }, "source": [ "If the elements of a dictionary are also *iterables* (like lists), you can use **chained indexing** to access nested elements. 
\n", "*This also applies to other iterables like lists or arrays.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Ia6tzlb3SsUM" }, "outputs": [], "source": [ "# let's say we want to know what grade John got on the third exam\n", "student1['scores'][2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "FdhcjenySsUM" }, "outputs": [], "source": [ "# here are some more students\n", "student2 = {'first': 'Mary', 'last': 'Johnson', 'year': 2024, 'graduate': False, 'scores': [89, 92, 96, 82]}\n", "student3 = {'first': 'Robert', 'last': 'Williams', 'year': 2022, 'graduate': False, 'scores': [88, 72, 64, 91]}\n", "student4 = {'first': 'Jennifer', 'last': 'Jones', 'year': 2023, 'graduate': True, 'scores': [92, 91, 94, 99]}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1fuvgtP5SsUM" }, "outputs": [], "source": [ "# we can store our students in a list\n", "students = [student1, student2, student3, student4]\n", "students" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "kZVBUV-JSsUM" }, "outputs": [], "source": [ "# what is the first name of the third student in the list?\n", "students[2]['first']" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "XU3k8Hf9SsUM" }, "outputs": [], "source": [ "# what score did Robert get on his third exam?\n", "students[2]['scores'][2]" ] }, { "cell_type": "markdown", "metadata": { "id": "UZVQTuv5SsUM" }, "source": [ "We can also construct a dictionary iteratively. Using `dict[key] = value` with a ***key*** that doesn't exist will automatically add a new **key-value** pair into the dictionary." 
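, "\n", "\n", "To make the difference between adding and overwriting concrete, here is a minimal sketch with a made-up dictionary called `pet` (a hypothetical name used only for illustration). Assigning to a key that does not exist adds a new pair, while assigning to a key that already exists replaces its value.

```python
# hypothetical dictionary, used only for this illustration
pet = {'name': 'Rex'}

# 'age' is not a key yet, so this adds a new key-value pair
pet['age'] = 3

# 'name' already exists, so this overwrites the old value
pet['name'] = 'Max'

print(pet)
print(len(pet))
```

Because keys are unique, the dictionary ends up with two pairs, not three."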
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8VpIszb7SsUM" }, "outputs": [], "source": [ "student = {}\n", "student['first'] = 'Linda'\n", "student['last'] = 'Wilson'\n", "student['year'] = 2025\n", "student['graduate'] = False\n", "student['scores'] = [84, 92, 89, 94]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "t0HD8qF4SsUN" }, "outputs": [], "source": [ "student" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "P4xC1mO_SsUN" }, "outputs": [], "source": [ "# let's add this student to our list of students\n", "students.append(student)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tquMsVO0SsUN" }, "outputs": [], "source": [ "students" ] }, { "cell_type": "markdown", "metadata": { "id": "kyUA6FvESsUN" }, "source": [ "---\n", "\n", "### Exercises involving Dictionaries\n", "\n", "Create a new student record using the first four elements from our previous list `scores`. \n", "Make sure to include all of the same fields as previous student records. Make up the values for all fields except *scores*.\n", "\n", "Finally, add the newly created student to the list `students` (and verify it's there).\n", "\n", "*__Hint:__ Remember that you can use slicing to extract a range of elements from a list.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "aunrAKtlSsUN" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "OVQj-RC0SsUN" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> One option would be to create a new variable like above and then add it to the list of dictionaries. \n", "> Or, instead, we could both create a new dictionary and add it to our list all in one statement.\n", "> ```python\n", "> students.append({'first': 'Monty',\n", ">                  'last': 'Python',\n", ">                  'year': 2025,\n", ">                  'graduate': True,\n", ">                  'scores': scores[:4]})\n", "> ```\n", "\n", "> Remember that you can omit the zero when slicing from the beginning of a list, hence `scores[:4]`. \n", "> As this does not output anything, you would need to call `students` or use `print()` to see if it actually worked.\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "cvIx6iA0SsUN" }, "source": [ "---\n", "\n", "## Functions\n", "\n", "**Functions** take an input, do something with it, and provide an output." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "w7WeeZLUSsUN" }, "outputs": [], "source": [ "# example of a function that takes a number and outputs its square\n", "def square(x):\n", "    sq = x * x\n", "    return sq" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qYMe3uIDSsUN" }, "outputs": [], "source": [ "square(5)" ] }, { "cell_type": "markdown", "metadata": { "id": "nTvdHq_mSsUN" }, "source": [ "Note that any variables defined within a function definition are temporary and only exist within that function. \n", "For example, the variable `sq` does not exist outside the function, even though we just used it when we called `square()`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "s8J2QrS5SsUN" }, "outputs": [], "source": [ "# attempting to refer to sq will give us an error because it is not defined\n", "sq" ] }, { "cell_type": "markdown", "metadata": { "id": "VyJd-7fHSsUN" }, "source": [ "We must use the `return` statement to make the function output any desired values. \n", "Note that the whole expression to the right of the `return` keyword gets evaluated first, and then the result of the whole expression gets outputted." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "-AMOllmASsUN" }, "outputs": [], "source": [ "# knowing this we can rewrite our function\n", "def square(x):\n", "    return x * x" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QLHx0Je8SsUN" }, "outputs": [], "source": [ "square(3)" ] }, { "cell_type": "markdown", "metadata": { "id": "tfOpQQzLSsUN" }, "source": [ "However, functions do not need to take input nor do they need to return anything." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jRMJowIUSsUN" }, "outputs": [], "source": [ "# example of a function that does not take input or return anything\n", "def print_spam():\n", "    print('spam')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BSOUNGh1SsUN" }, "outputs": [], "source": [ "print_spam()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Jzh5IlCOSsUN" }, "outputs": [], "source": [ "returned_value = print_spam()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "KIJ3g3IVSsUN" }, "outputs": [], "source": [ "# a function without a return statement returns None\n", "print(returned_value)" ] }, { "cell_type": "markdown", "metadata": { "id": "qVTg6G42SsUN" }, "source": [ "Note that printing and returning are not the same:\n", "- `print()` is used to display information intended for humans on the computer screen\n", "- `return` is used for a function to output a value that can be stored in a variable or used elsewhere" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Qu0YwT7HSsUN" }, "outputs": [], "source": [ "# note that printing and returning are not the same thing\n", "# this function does not take input, but still returns something\n", "def return_spam():\n", "    return 'spam'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "M_Z938NaSsUN" }, "outputs": [], "source": [ "returned_value = return_spam()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "syQNcgD0SsUN" }, "outputs": [], "source": [ "print(returned_value)" ] }, { "cell_type": "markdown", "metadata": { "id": "vSJLJuUbSsUN" }, "source": [ "What happens if you call `square()` on a list?"
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dE6RiQkoSsUO" }, "outputs": [], "source": [ "square(scores)" ] }, { "cell_type": "markdown", "metadata": { "id": "rBlr0EvhSsUO" }, "source": [ "Because you do not need to specify the types of variables in Python, it is very important to think about the **type** of the input when using or defining functions.\n", "\n", "---\n", "\n", "### Exercises involving Functions\n", "Write a function that adds 42 to whatever number you give it." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "iwpb3ExhSsUO" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "hIL6FSFSSsUO" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> Using the `square` function from above as an example, we can come up with something like this.\n", "> ```python\n", "> def add_forty_two(number):\n", ">     output = number + 42\n", ">     return output\n", "> ```\n", "\n", "> You should always give your functions and variables short yet descriptive names. \n", "> Note that the variable `output` in the function definition above is redundant and can be omitted.\n", "> ```python\n", "> def add_forty_two(number):\n", ">     return number + 42\n", "> ```\n", "\n", "> Also note that there is never a single right answer to these exercises. Your function might look different, but still work. \n", "> You should always test your function with various inputs and ensure the outputs are as expected.\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "NTnq2pwnSsUO" }, "source": [ "Write a function with the following characteristics: \n", "***Input:*** A word (string). \n", "***Output:*** The same word but with `\" and spam\"` added to the end of it." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JWcDWPV7SsUO" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "VSiuWn5cSsUO" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> Very similar to the function above, except this time we use string arithmetic.\n", "> ```python\n", "> def append_spam(word):\n", ">     return word + ' and spam'\n", "> ```\n", "\n", "> Note that you should always use `return` to emit output from a function. \n", "> Use `print()` only when the purpose of your function is actually to display content on the screen.\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "Y_CTxnJASsUO" }, "source": [ "Write a function with the following characteristics: \n", "***Input:*** Two numbers: the base and the exponent. \n", "***Output:*** The first number (base) to the power of the second number (exponent).\n", "\n", "*__Bonus:__ Use a named argument and make the function square the base by default. \n", "Here is a reference to help out with that: *" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "L6yKP4K9SsUO" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "nhJs--wvSsUO" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> A basic solution without using any named arguments would be as follows:\n", "> ```python\n", "> def power(base, exponent):\n", ">     return base ** exponent\n", "> ```\n", "\n", "> Alternatively, we could make `exponent` a named argument and give it a default value of two.\n", "> ```python\n", "> def power(base, exponent=2):\n", ">     return base ** exponent\n", "> ```\n", "\n", "> Now you can omit `exponent` completely and the function will automatically square the `base`. \n", "> Also note that when defining function arguments like this, you can use them both as positional and named arguments when calling the function. \n", "> You can try this out by copying and running the code below. Note all the different ways you can call functions.\n", "> ```python\n", "> # omitting the exponent will square the number given\n", "> print(power(2))\n", ">\n", "> # you can use the function with positional arguments as you normally would\n", "> print(power(2, 3))\n", ">\n", "> # for better clarity you can refer to the exponent argument by name\n", "> print(power(2, exponent=4))\n", ">\n", "> # you can also refer to all the arguments by name\n", "> print(power(base=2, exponent=5))\n", ">\n", "> # you are free to change the order of the arguments when referring to all of them by name\n", "> print(power(exponent=6, base=2))\n", "> ```\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "OsBV80ZgSsUO" }, "source": [ "Write a function with the following characteristics: \n", "***Input:*** A string and a number. \n", "***Output:*** The same string except the first and last letters (characters) have been duplicated by the specified number of times." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1O-HzhcuSsUO" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "j8t2T4gvSsUO" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> By combining string arithmetic and slicing, we can write a quick one-liner.\n", "> ```python\n", "> def enbiggen_word(word, x):\n", ">     return word[0] * x + word[1:-1] + word[-1] * x\n", "> ```\n", ">\\\n", "> Remember that the last element of a list (or the last character of a string) is at index `-1`. \n", "> Again, your function could look different and span across multiple lines, but still work. This is just a compact example.\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "2KMLkd1KSsUO" }, "source": [ "---\n", "\n", "## Loops\n", "\n", "**For** loops allow us to iterate over elements of a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "latDdZE0SsUO" }, "outputs": [], "source": [ "# let's say we want to print out each of my scores one by one\n", "\n", "for score in scores:\n", " print(score)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zUziQXGsSsUO" }, "outputs": [], "source": [ "# some of the low scores make us really sad\n", "# let's create an illusion of happiness by adding five points to each grade\n", "\n", "for score in scores:\n", " score = score + 5\n", " print(score)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jOL-rNAeSsUO" }, "outputs": [], "source": [ "# did that change our original scores?\n", "scores" ] }, { "cell_type": "markdown", "metadata": { "id": "_n6JIbWQSsUO" }, "source": [ "Note that the `score` in `for score in scores` is a temporary variable that only exists within the loop and contains a ***copy*** of the element in `scores` that corresponds to the current iteration. Because it is a **copy**, any modifications to it will not be represented in the original list `scores`. If we want to modify elements of a list within a loop, we have to iterate over the *indices* of the list and then use `scores[index]` to access the element within the loop.\n", "\n", "**While** loops allow us to repeat a code segment as long as a certain condition is met." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "CAXf5EbNSsUO" }, "outputs": [], "source": [ "# there really are not any good examples of while loops at the beginner level\n", "# so here is one that just prints \"spam\" ten times\n", "\n", "count = 0 # how many times we have printed spam\n", "\n", "while count < 10:\n", "    print('spam')\n", "    count = count + 1 # we must update the count to eventually exit the loop\n", "\n", "# we could also achieve this using a for loop on a random 10-element list" ] }, { "cell_type": "markdown", "metadata": { "id": "dNjnk4wzSsUO" }, "source": [ "In Python `range()` is often used as an alternative to while loops or to iterate over the indices of a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "GieCaoO4SsUO" }, "outputs": [], "source": [ "# we can use range to do something a specified number of times\n", "for i in range(10):\n", "    print('spam')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1JuzEN5KSsUO" }, "outputs": [], "source": [ "# or we can use it to iterate over the indices of a list, as opposed to the elements themselves\n", "for i in range(len(scores)):\n", "    print('the score at index', i, 'is', scores[i])" ] }, { "cell_type": "markdown", "metadata": { "id": "NAcGAzr7SsUO" }, "source": [ "However, when we want to access **both** each element **and** its index number, `enumerate()` is a much better option." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dZKlEIH9SsUP" }, "outputs": [], "source": [ "for i, score in enumerate(scores):\n", "    print('the score at index', i, 'is', score)" ] }, { "cell_type": "markdown", "metadata": { "id": "_0Lh8jjaSsUP" }, "source": [ "---\n", "\n", "### Sidenote: Tuples\n", "\n", "Why did we need to use `i, score` when we used `enumerate(scores)`? \n", "That's because `enumerate()` produces a sequence of tuples, each pairing an index with the corresponding element. 
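
As a minimal sketch of this, assuming a made-up list `demo_scores` (hypothetical values, not the workshop's `scores`), wrapping `enumerate()` in `list()` makes the index-element pairs visible before the loop unpacks them.

```python
demo_scores = [88, 92, 79]  # hypothetical values, not the workshop's scores

# enumerate() is lazy, so wrap it in list() to see what it produces
pairs = list(enumerate(demo_scores))
print(pairs)

# each pair is a tuple that can be unpacked into an index and an element
for i, score in pairs:
    print('the score at index', i, 'is', score)
```

Writing `for i, score in ...` is simply this unpacking applied to every pair in turn.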
\n", "A **tuple** is an ordered ***unchangeable*** collection of elements." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wQJ6_BHmSsUP" }, "outputs": [], "source": [ "# example of a tuple\n", "triple = (1, 2, 3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5uLM1jYGSsUP" }, "outputs": [], "source": [ "# we can use indexing to access tuple elements\n", "triple[1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sBp4kQzpSsUP" }, "outputs": [], "source": [ "# we can also easily extract all elements from a tuple\n", "a, b, c = triple\n", "print(a, b, c)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "OypMUmZ5SsUP" }, "outputs": [], "source": [ "# we can also ignore elements we are not interested in\n", "_, second, _ = triple\n", "second" ] }, { "cell_type": "markdown", "metadata": { "id": "Bq9FNS9mSsUP" }, "source": [ "---\n", "\n", "### Exercises involving Loops\n", "Use a **for-loop** to write a function with the following characteristics: \n", "***Input:*** A list of numbers. \n", "***Output:*** A copy of the input where the number 42 has been added to each element.\n", "\n", "*If nothing comes to mind at first, these hints might be of help:*\n", "1. *Remember that simply modifying a list element in a for-loop does not change the original list.*\n", "2. *Empty square brackets __`[ ]`__ denote an empty list.*\n", "3. *You might want to use the `.append()` method and a function from a previous challenge.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "PgDBKGTESsUP" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "RWKdpy2YSsUP" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> ```python\n", "> def new_list_by_adding_42(xs):\n", ">     output = [] # create a new list to output\n", ">     for x in xs: # for every element x in the list xs\n", ">         output.append(x + 42) # add 42 to x and store the result in the list output\n", ">     return output # return the final output list once all elements of xs processed\n", "> ```\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "A7k7TntRSsUP" }, "source": [ "Use a **loop** to write a function with the following characteristics: \n", "***Input:*** A number. \n", "***Output:*** The word `\"spam\"` printed out said number of times, each on a new line." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "x9IqtL77SsUP" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "XY5e4uxVSsUP" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> One option would be to use `while` like in the example above.\n", "> ```python\n", "> def print_many_spam(n):\n", ">     count = 0 # to keep track of how many times we have printed spam\n", ">     while count < n: # keep printing until count reaches specified number\n", ">         print('spam') # using return instead of print would exit the function\n", ">         count = count + 1 # update count to eventually exit loop\n", "> ```\n", "\n", "> An easier and more Pythonic way of doing this would be to use `range()` and `for` instead.\n", "> ```python\n", "> def print_many_spam(n):\n", ">     for _ in range(n):\n", ">         print('spam')\n", "> ```\n", "\n", "> Remember that we can use the underscore `_` to denote variables we do not care about. \n", "> When using `range()` in this context, we are not interested in the actual numbers it produces.\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "-2nauETWSsUP" }, "source": [ "Use `range()` to write a function that takes a list of numbers as an input, but does not output anything. \n", "Instead, it modifies the input list such that 42 gets added to each element.\n", "\n", "You might be tempted use `scores` to test this function. **Do not do that!** Instead, use the provided **copy** if desired. \n", "*We will talk about the importance of the `.copy()` method in a little bit.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "LWEUtgIdSsUP" }, "outputs": [], "source": [ "# create a copy of scores for testing purposes\n", "scores_copy = scores.copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ET9GK-ddSsUP" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "F04ubX-4SsUP" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> To modify the elements of a list, we need to loop over the indices of the list and refer to each element via its index. \n", "> We can use `range()` with `len()` to generate a list of index numbers to use.\n", "> ```python\n", "> def modify_list_by_adding_42(xs):\n", ">     for i in range(len(xs)):\n", ">         xs[i] = xs[i] + 42\n", "> ```\n", "\n", "> Alternatively we could use `enumerate()`. Study the example below and see how it behaves the same.\n", "> ```python\n", "> def modify_list_by_adding_42(xs):\n", ">     for i, x in enumerate(xs):\n", ">         xs[i] = x + 42\n", "> ```\n", "\n", "> Note that because these functions modify the list provided as input, they do not output anything. \n", "> You should take a look at the list provided as input to see if they actually worked.\n", "
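
One way to check an in-place function like this is sketched below. The list `demo` is a hypothetical test input rather than anything defined in the workshop, and the function body repeats the first solution so that the snippet runs on its own.

```python
def modify_list_by_adding_42(xs):
    # loop over indices so that the assignment changes the list itself
    for i in range(len(xs)):
        xs[i] = xs[i] + 42

demo = [1, 2, 3]  # hypothetical test list
result = modify_list_by_adding_42(demo)

print(result)  # the function has no return statement, so this is None
print(demo)    # the input list itself has been changed
```

Seeing `None` for `result` alongside a changed `demo` confirms that all of the work happened on the input list.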
" ] }, { "cell_type": "markdown", "metadata": { "id": "o4Rwfu_wSsUP" }, "source": [ "Remember the dictionary `students` from before?\\\n", "Write a function that curves the scores of all the students in the dictionary by adding five points to each score." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "2uKpBsMaSsUP" }, "outputs": [], "source": [ "# this is the dictionary in case you do not remember\n", "students" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BGXTJkreSsUP" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "V4amorVqSsUP" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> After some trial and error or a quick web search, you learn that using a for-loop to iterate over dictionary entries actually gives you both read and write access. \n", "> However, because the grades themselves are stored in a list, we must use `range()` or `enumerate()` to change them.\n", "> ```python\n", "> def curve_scores(students):\n", ">     for student in students:\n", ">         for i, score in enumerate(student['scores']):\n", ">             student['scores'][i] = score + 5\n", "> ```\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "-FBMsoBESsUP" }, "source": [ "---\n", "\n", "## Advanced: Alternatives to Loops\n", "\n", "### Lambda Functions and Mapping\n", "\n", "Remember the function `square()` from before? \n", "Such a simple function can easily be written as a quick one-liner." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sBYlLtKSSsUP" }, "outputs": [], "source": [ "square = lambda x : x * x" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "VZ-xd-lzSsUP" }, "outputs": [], "source": [ "square(4)" ] }, { "cell_type": "markdown", "metadata": { "id": "-Cl3_Du8SsUP" }, "source": [ "These one-line functions called **lambda** functions, but also often referred to as *anonymous* functions. \n", "That's because they do not have a name. You can easily define a lambda function as follows, without ever having to specify a function name.\n", "\n", "`lambda arguments : expression`\n", "\n", "Note how you do not need to use `return`. The result of the one-line expression gets automatically outputted. \n", "If desired, you can store a lambda function definition into a variable and use it as you would a normal function.\n", "\n", "However, the true power of lambda functions are revealed when dealing with functions that expect other functions as input. One of these functions is the built-in `map()` function. It takes a function and an iterable (list), and returns a new iterable where the function has been applied to each element. It is kind of like using a `for` loop, except it is much more efficient and often the preferred alternative to using a for loop for simple element-wise operations." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JARrEEPNSsUQ" }, "outputs": [], "source": [ "# let's square all of the scores\n", "map(square, scores)" ] }, { "cell_type": "markdown", "metadata": { "id": "dZnm0n7iSsUQ" }, "source": [ "By default `map()` just creates a recipe on how to construct our new iterable or list. It will not actually create a new iterable unless explicitly asked to do so." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6wUZQo1xSsUQ" }, "outputs": [], "source": [ "list(map(square, scores))" ] }, { "cell_type": "markdown", "metadata": { "id": "XEW7SVcTSsUQ" }, "source": [ "But what if we want to cube all of the scores?\\\n", "We do not have to define a new function for that, we can just use `lambda`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rYEUaUdBSsUQ" }, "outputs": [], "source": [ "list(map(lambda x : x ** 3, scores))" ] }, { "cell_type": "markdown", "metadata": { "id": "yFXzJ6oNSsUQ" }, "source": [ "*More information on lambda functions in Python: *\n", "\n", "---\n", "\n", "### List Comprehension\n", "\n", "While the concepts of using anonymous functions and mapping over elements of an iterable are commonplace in other programming languages and very useful in Python when working with more complex data structures and using common data science libraries like NumPy and Pandas, they are not commonly used by Python programmers when working with lists. That is because Python has something called **list comprehensions**, which are a simpler and more *pythonic* alternative to using `lambda` and `map()`. This is what a general structure of a list comprehension looks like:\n", "\n", "`[expression for element in iterable]`\n", "\n", "Or, to use some dummy variables and functions:\n", "\n", "`[fun(x) for x in xs]`\n", "\n", "Using list comprehension, we can easily cube all of the scores as follows." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "foRdGy5uSsUQ" }, "outputs": [], "source": [ "[score ** 3 for score in scores]" ] }, { "cell_type": "markdown", "metadata": { "id": "K6v4mIldSsUQ" }, "source": [ "Note how list comprehension automatically creates a list. Also note how we could use whatever variable name we want, just like in a loop.\n", "\n", "We can use both expressions and predefined functions with list comprehensions. For example, the `square()` function from before could be used to square all of our scores." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JLRdXiLUSsUQ" }, "outputs": [], "source": [ "[square(x) for x in scores]" ] }, { "cell_type": "markdown", "metadata": { "id": "IYvh2GA7SsUQ" }, "source": [ "You can also combine list comprehension with other list generators to create specific lists. For example, we could easily get a list of the squares of the numbers 0 through 9 as follows." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pmdNcR2DSsUQ" }, "outputs": [], "source": [ "[x ** 2 for x in range(10)]" ] }, { "cell_type": "markdown", "metadata": { "id": "SU4pfkS_SsUQ" }, "source": [ "One potential drawback of list comprehensions is that they always build the entire list in memory. Every single element of the list gets calculated and stored, even if you only need to process the elements one at a time. When working with big data, this could be a problem as it would waste resources and slow down your processing. To overcome this, you could create a **generator** instead, which produces its elements one at a time on demand.\n", "\n", "For example, we could use a generator to get the sum of the first 10,000 squares."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "WhX6wv_sSsUQ" }, "outputs": [], "source": [ "squares = (x ** 2 for x in range(10000))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "FbNOrAnoSsUQ" }, "outputs": [], "source": [ "squares" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4y_poLI9SsUQ" }, "outputs": [], "source": [ "sum(squares)" ] }, { "cell_type": "markdown", "metadata": { "id": "1V5-d7FPSsUQ" }, "source": [ "---\n", "\n", "### Exercises involving Map, Lambda, and List Comprehensions\n", "\n", "Use `map()` and a `lambda` function to rewrite this function from before: \n", "***Input:*** A list of numbers. \n", "***Output:*** A copy of the input where the number 42 has been added to each element." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "k652aSg4SsUQ" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "QgUv9l9BSsUQ" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> ```python\n", "> def new_list_by_adding_42(xs):\n", ">     return list(map(lambda x : x + 42, xs))\n", "> ```\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "Gyv7pivHSsUQ" }, "source": [ "Now do the same using **list comprehension** instead." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RhSIioCmSsUQ" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "IcSq3vLnSsUQ" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> ```python\n", "> def new_list_by_adding_42(xs):\n", ">     return [x + 42 for x in xs]\n", "> ```\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "K7VmpL01SsUQ" }, "source": [ "---\n", "\n", "## Conditionals\n", "\n", "**If ... else** blocks can be used to run different segments of code depending on specified criteria." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5DJ6mHCYSsUQ" }, "outputs": [], "source": [ "# we would like to know if an exam score corresponds to the letter grade A\n", "# we can write a function that congratulates us if we have an A\n", "\n", "def check_if_A(score):\n", " if score >= 90:\n", " print('Score', score, 'corresponds to an A.')\n", " print('Congratulations!')\n", " else:\n", " print('Score', score, 'does not correspond to an A.')\n", " print('Sorry...')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zew0xoZXSsUQ" }, "outputs": [], "source": [ "check_if_A(91)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rsHzOYNNSsUQ" }, "outputs": [], "source": [ "check_if_A(89)" ] }, { "cell_type": "markdown", "metadata": { "id": "mv9YwTgtSsUR" }, "source": [ "We can also omit the `else` if desired, in that case the code does not get run if the specified condition is not met." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "J85WYZu8SsUR" }, "outputs": [], "source": [ "def yay_iff_A(score):\n", " if score >= 90:\n", " print('YAY!')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "EYexP4lmSsUR" }, "outputs": [], "source": [ "yay_iff_A(91)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "G10OPfmUSsUR" }, "outputs": [], "source": [ "yay_iff_A(89)" ] }, { "cell_type": "markdown", "metadata": { "id": "7c72i2aESsUR" }, "source": [ "We can also stack conditionals using `elif` to provide more functionality. \n", "The conditional statements get evaluated in order, from top to bottom." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "oybUQczsSsUR" }, "outputs": [], "source": [ "# knowing the letter grade ranges, we can write a function that maps a score to a letter grade\n", "def get_letter_grade(score):\n", "    if score < 60:\n", "        return 'F'\n", "    elif score < 70:\n", "        return 'D'\n", "    elif score < 80:\n", "        return 'C'\n", "    elif score < 90:\n", "        return 'B'\n", "    else:\n", "        return 'A'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "StMn-xEHSsUR" }, "outputs": [], "source": [ "# Now we can use a loop to get a list of letter grades from our scores\n", "\n", "letter_grades = []\n", "\n", "for score in scores:\n", "    letter_grades.append(get_letter_grade(score))\n", "\n", "letter_grades" ] }, { "cell_type": "markdown", "metadata": { "id": "toH6SuiKSsUR" }, "source": [ "---\n", "### Exercises involving Conditionals\n", "Use a **conditional** and a **for-loop** to write a function with the following characteristics: \n", "***Input:*** A list of exam scores. \n", "***Output:*** A new list containing only the exam scores that correspond to the letter grade **A**, in order.\n", "\n", "*If nothing comes to mind at first, these hints might be of help:*\n", "1. *Feel free to use the function we just created and your code from a previous challenge.*\n", "2. *Use `==` to check for equality. This also works for strings.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QG_8cBB6SsUR" }, "outputs": [], "source": [ "# answer here\n" ] }, { "cell_type": "markdown", "metadata": { "id": "uwkgNFOASsUR" }, "source": [ "
\n", "Click to reveal solution\n", "\n", "\n", "> One option would be to recycle the `get_letter_grade()` function from before.\n", "> ```python\n", "> def extract_good_scores(scores):\n", ">     output = []\n", ">     for score in scores:\n", ">         if get_letter_grade(score) == 'A':\n", ">             output.append(score)\n", ">     return output\n", "> ```\n", "\n", "> However, because we only care about a very small fraction of the scores, it makes more sense to just check the value of the score directly.\n", "> ```python\n", "> def extract_good_scores(scores):\n", ">     output = []\n", ">     for score in scores:\n", ">         if score >= 90:\n", ">             output.append(score)\n", ">     return output\n", "> ```\n", "
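
As a usage sketch, the second solution can be tried on a made-up list `demo_scores` (hypothetical values, not the workshop's `scores`). The final lines show, as an aside, that a list comprehension with a trailing `if` condition gives the same result; that form goes slightly beyond what the list comprehension section covered.

```python
def extract_good_scores(scores):
    output = []
    for score in scores:
        if score >= 90:
            output.append(score)
    return output

demo_scores = [95, 72, 90, 89, 100]  # hypothetical values
good = extract_good_scores(demo_scores)
print(good)

# the same filter as a list comprehension with a trailing condition
good_again = [score for score in demo_scores if score >= 90]
print(good == good_again)
```

Both versions keep the scores in their original order, as the exercise asks.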
" ] }, { "cell_type": "markdown", "metadata": { "id": "XtCG-FJ-SsUR" }, "source": [ "---\n", "\n", "## Advanced: Copy vs View\n", "What if we would like create a copy of a list? \n", "Our first thought would be to use the assignment operator `=` as we would with any variable." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qGBzYUbISsUR" }, "outputs": [], "source": [ "numbers = [5, 4, 3, 2, 1]\n", "numbers_copy = numbers" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "q2ZuCT7USsUR" }, "outputs": [], "source": [ "numbers_copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "O35x5n2QSsUR" }, "outputs": [], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "id": "0AD-kRSrSsUR" }, "source": [ "Both lists appear identical, so it looks like it worked. \n", "But what if we make some modifications to the copy?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dfxLYQD7SsUR" }, "outputs": [], "source": [ "numbers_copy[0] = -1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5rcY299LSsUR" }, "outputs": [], "source": [ "numbers_copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "PvH1erJBSsUR" }, "outputs": [], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "id": "-k-9a_4qSsUR" }, "source": [ "Even though we only changed `numbers_copy`, the original list `numbers` also changed. \n", "That is because Python optimized in the background and did not give us a **copy** of the list. \n", "Instead, it took a shortcut and gave us a **view**, meaning that the names `numbers_copy` and `numbers` actually refer to the same object. \n", "This kind of behavior is often referred to as ***pass-by-reference***. 
The behavior where a **copy** is made instead is known as ***pass-by-value***.\n", "\n", "Simple (immutable) data structures like numbers behave as if they were passed by ***value*** in Python: because they cannot be modified in place, an operation such as `number_copy + 1` creates a new object, so it always looks like we got a **copy**." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "MJ8zyc6hSsUR" }, "outputs": [], "source": [ "number = 42\n", "number_copy = number" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "P8rPT_eTSsUR" }, "outputs": [], "source": [ "number_copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7GagQV8PSsUR" }, "outputs": [], "source": [ "number" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wH9nmCLeSsUS" }, "outputs": [], "source": [ "number_copy = number_copy + 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "uSrLxqH7SsUS" }, "outputs": [], "source": [ "number_copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8YDyJLJmSsUS" }, "outputs": [], "source": [ "number" ] }, { "cell_type": "markdown", "metadata": { "id": "bWDv8iwdSsUS" }, "source": [ "However, when it comes to more complex (mutable) data structures like lists, assignment simply creates a second name for the same object, which behaves like ***pass-by-reference***. \n", "Hence, when working with iterables and other complex data structures, it is important to keep in mind that you get a **view** unless you explicitly ask for a copy.\n", "\n", "To ensure we get a **copy**, we should use the `.copy()` method."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "0nc0ofClSsUS" }, "outputs": [], "source": [ "numbers = [5, 4, 3, 2, 1]\n", "numbers_copy = numbers.copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "OYBu417tSsUS" }, "outputs": [], "source": [ "numbers_copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yyZmvvE7SsUS" }, "outputs": [], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "id": "J9S0qnclSsUS" }, "source": [ "Now `numbers` and `numbers_copy` are actually two different lists, so we can modify one without changing the other." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "kqLAwUKLSsUS" }, "outputs": [], "source": [ "numbers_copy[0] = -1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7ZV4mC0PSsUS" }, "outputs": [], "source": [ "numbers_copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8xwyrnH3SsUS" }, "outputs": [], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "id": "W_n2SRdISsUS" }, "source": [ "It is crucial to keep this behavior in mind when working with more advanced packages like *Pandas* or *NumPy*. Because Python might or might not choose to be clever and optimize in the background, you can never be completely sure whether you are dealing with a **view** or a **copy**. In most cases, the default behavior is to return a **view**, unless explicitly asked to do otherwise. 
However, this might differ from package to package or even function to function.\n", "\n", "Hence, **always read the documentation** and use the appropriate function to ensure you know whether you are dealing with a **view** or **copy**.\n", "\n", "If you would like to learn more about when Python uses *pass-by-value* and when it uses *pass-by-reference*, check out these resources:\n", "\n", "- [Is Python *Pass-by-Reference* or *Pass-by-Value*?](https://robertheaton.com/2014/02/09/pythons-pass-by-object-reference-as-explained-by-philip-k-dick/)\n", "- [Understanding Python Variables and Memory Management](http://foobarnbaz.com/2012/07/08/understanding-python-variables/)" ] }, { "cell_type": "markdown", "metadata": { "id": "_tiKZhmKSsUS" }, "source": [ "---\n", "\n", "## Working with Libraries\n", "\n", "This is how you import modules, packages, or libraries." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "poYCtwA0SsUS" }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": { "id": "M9VUnrgHSsUS" }, "source": [ "Let us experiment with some **NumPy** functions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "R9T-ePikSsUS" }, "outputs": [], "source": [ "# create an array and fill it with evenly spaced numbers from 0 to 2pi\n", "x = np.linspace(0, 2*np.pi, num=100, dtype=float)" ] }, { "cell_type": "markdown", "metadata": { "id": "oxJsv8hISsUS" }, "source": [ "What are `num` and `dtype`?\n", "\n", "Use `help()` or check out the documentation online: \n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JM1ZKMLiSsUS" }, "outputs": [], "source": [ "help(np.linspace)" ] }, { "cell_type": "markdown", "metadata": { "id": "MxZ5t0-2SsUS" }, "source": [ "NumPy arrays have an attribute `.size` that equals the number of elements."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ozkoXDSGSsUS" }, "outputs": [], "source": [ "x.size" ] }, { "cell_type": "markdown", "metadata": { "id": "AC0JMBq_SsUS" }, "source": [ "Let's calculate the sine of `x` and plot it using **Matplotlib**." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "OIV4Ja9MSsUS" }, "outputs": [], "source": [ "y = np.sin(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jE9_LrhxSsUS" }, "outputs": [], "source": [ "# define the plot\n", "plt.plot(x, y, '*-')\n", "\n", "# add labels\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "\n", "# show the plot\n", "plt.show()\n", "\n", "# the '*-' is just a shorthand for a solid line with a star marker at each data point" ] }, { "cell_type": "markdown", "metadata": { "id": "knxMF1MbSsUS" }, "source": [ "---\n", "\n", "## Additional Resources\n", "\n", "- Kaggle Python Course: \n", "- W3Schools Python Tutorial: \n", "- Google's Python Class: \n", "- Official Python Tutorial: \n", "\n", "Software Carpentry Lessons\n", "- Programming with Python: \n", "- Plotting and Programming in Python: " ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 0 }