Introduce the course
Using Slack & passing code
Discuss all the things you need to install and how to do it.
Take a tour of the various Integrated Development Environments (IDE)s we’ll come in contact with in this class.
reticulate
Using Markdown and Latex in Jupyter Notebooks and/or RMarkdown
magic commands when using a Jupyter Kernel (Jupyter Notebook/Atom + Hydrogen)
Cover some commandline basics
Throughout the semester, I will use the commandline and many different IDEs when coding in Python or using Git. Here is an overview of these tools with instructions on how to install them. (These materials are also on the main course website).
At times, we’ll use a Unix-based commandline. The commandline will feature in our discussion on using git and also on running Python programs. If you use a Mac or a Linux operating system, then a functioning commandline comes with your operating system. For Apple machines, this is the Terminal.
For Windows (specifically Windows 10), you can enable the Linux Bash shell. The following offers a tutorial on how to do this. If you’re using a version of Windows that pre-dates version 10, then Git Bash offers a program that will allow you to use git commands from your Windows machine.
Finally, you’ll notice that my terminal will have a slightly different look than the one on your machine. This is because I’m using “Oh My Zsh”, which is open-source software that allows me to customize my commandline. The above link offers everything you’d need to install Oh My Zsh on your machine.
We’ll use Python3 throughout this course. Below are instructions for downloading Python3 using a commandline package manager (Homebrew for Mac, Chocolatey for Windows).
An alternative way to install Python3 is to download an Anaconda distribution. The instructor will use pip rather than conda in the instructions for downloading Python modules. These are simply two ways of downloading and managing open-source software packages. Choose whichever works best for you.
Once you have Python3 on your computer, you can install Jupyter Notebook. If you downloaded Python3 using Anaconda, then Jupyter Notebook comes with the distribution and requires no further installation on your part. If you installed Python3 using Homebrew/Chocolatey, you can install Jupyter Notebook by running the following code from your commandline.
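As a rough check before installing, the sketch below asks Python whether a package is already importable and, if not, prints a pip command you could run from the commandline. The helper names (is_installed, pip_install_command) are hypothetical, and the package name jupyter with its notebook module is an assumption about a typical pip-based setup, not a command taken from the course materials.

```python
import importlib.util
import shlex
import sys


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package):
    """Build (but do not run) a pip install command for this interpreter."""
    # Using sys.executable ties pip to the Python you are currently running.
    python = sys.executable or "python3"
    return [python, "-m", "pip", "install", package]


# Assumption: the Jupyter Notebook application is importable as "notebook"
# and installable from PyPI under the name "jupyter".
if is_installed("notebook"):
    print("Jupyter Notebook is already installed.")
else:
    print("Run this from your commandline:")
    print(shlex.join(pip_install_command("jupyter")))
```

Building the command from sys.executable, rather than calling a bare pip, helps when more than one Python is installed, since the package lands in the interpreter you actually checked.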
You can then activate a Jupyter Notebook from the commandline by typing:
If you’ve installed Python using Anaconda, the distribution provides a click-able icon to fire up a Jupyter Notebook. The advantage of using the commandline, however, is that you can set the working directory prior to firing up a notebook. This will allow you to work within a specific project folder more easily.
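To confirm which folder a notebook (or any Python session) is working in, a minimal sketch like the following can be run in a code cell. The function name and the values it returns are illustrative, not part of any course tooling.

```python
from pathlib import Path


def working_directory_summary():
    """Return the current working directory and the names of its entries."""
    cwd = Path.cwd()
    # Sorted so the listing is stable from run to run.
    entries = sorted(p.name for p in cwd.iterdir())
    return cwd, entries


cwd, entries = working_directory_summary()
print(f"Working directory: {cwd}")
print(f"{len(entries)} item(s) found here")
```

If the printed folder is not your project folder, relative file paths in the notebook will not resolve the way you expect.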
Atom + Hydrogen
Atom is a hack-able text editor built by GitHub. The following are instructions on how to install Atom on your machine.
Atom allows you to install open-source packages that provide additional functionality. The following packages will help you as you use Atom to program in Python. Of these, Hydrogen is the most important. It’ll allow you to use a Jupyter kernel from within Atom to evaluate code.
├── Hydrogen@2.14.4
├── atom-beautify@0.33.4
├── atom-language-r@1.4.8
├── atom-material-syntax@1.0.8
├── atom-material-syntax-light@0.4.6
├── atom-material-ui@2.1.3
├── auto-update-packages@1.0.1
├── autocomplete-R@0.6.0
├── autocomplete-modules@2.3.0
├── autocomplete-python@1.16.0
├── color-picker@2.3.0
├── docblock-python@0.19.0
├── file-icons@2.1.42
├── fix-indent-on-paste@0.1.1
├── fold-comments@0.6.0
├── git-log@0.4.1
├── hey-pane@1.2.0
├── hydrogen-cell-separator@0.4.1
├── language-weave@0.7.2
├── minimap@4.29.9
├── pdf-view@0.72.0
├── platformio-ide-terminal@2.10.0
├── python-indent@1.2.6
├── reindent@1.5.0
├── simple-drag-drop-text@0.5.0
├── symbols-tree-view@0.14.0
└── wordcount@3.2.0
To install any one of these packages from the commandline, type:
There is also a dedicated package manager built into Atom, which you can use to download and install new packages. Open Atom, go to Settings > Install, and type the package name.
RStudio + reticulate
In your classes that are focused on using R, RStudio will be your main IDE. However, RStudio isn’t just for R. It can handle a number of different languages. We’ll run Python in RStudio using the reticulate package. We’ll talk about some of the advantages of doing this in class, but for now, let’s cover installation.
To install RStudio, download it from the following link (make sure to scroll all the way to the bottom).
reticulate is an R package that allows one to run a Python REPL in the R console. In addition, it allows one to read in and use Python code, and pass data between R and Python. The following provides instructions on installing reticulate.
Note: If you have multiple versions of Python on your computer, reticulate can get confused about which version it should use. The following article covers these issues. The best way to resolve this issue is to create a .Rprofile file that specifies the version of Python you wish to use.
“By setting the value of the RETICULATE_PYTHON environment variable to a Python binary. Note that if you set this environment variable, then the specified version of Python will always be used (i.e. this is prescriptive rather than advisory). To set the value of RETICULATE_PYTHON, insert Sys.setenv(RETICULATE_PYTHON = PATH) into your project’s .Rprofile, where PATH is your preferred Python binary.”
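To find a value for PATH, one option is to ask the Python interpreter you want reticulate to use where it lives. The sketch below prints that location formatted as the line described in the quote above. The helper name rprofile_line is hypothetical.

```python
import sys


def rprofile_line(python_path):
    """Format the Sys.setenv() line for a project's .Rprofile."""
    # Forward slashes avoid backslash-escaping problems inside R strings.
    normalized = python_path.replace("\\", "/")
    return f'Sys.setenv(RETICULATE_PYTHON = "{normalized}")'


# sys.executable is the absolute path of the interpreter running this code.
print(rprofile_line(sys.executable))
```

Run this with the Python you want R to use (for example, from inside the relevant environment), then paste the printed line into your project’s .Rprofile.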
Here is an overview of other text editors that are popular for programming in Python, which you won’t see featured in this course. Note that I’m agnostic about what you use to learn Python, and some find that different setups work better for them. If one of these setups works better for you, I encourage you to use it (and tell me about how it went)!
One of the primary advantages of using notebooks (Jupyter and RMarkdown) when writing code is that we can mix code with prose (text). That is, we can put the code side-by-side with the analysis, combining analytics with insight.
Markdown offers a lightweight language for formatting text. The following cheatsheet provides a useful guide to various Markdown commands.
Try rendering Markdown code in a Jupyter Notebook and/or RMarkdown (.rmd).
\(\LaTeX\) (“Lah-Tec”) is a “document preparation system for high-quality typesetting. It is most often used for medium-to-large technical or scientific documents but it can be used for almost any form of publishing.” At its best, LaTeX allows for complete customization of a document from scratch. In practice, this usually means “typing pretty math equations”. At its worst, LaTeX is convoluted and a fine way to waste one’s time.
We’ll use LaTeX to write and render mathematical formulas. LaTeX plays well with markdown, making it easy to write text, code, and math! This will be useful when we need to be technical.
LaTeX math has its own syntax that can be hard to use when first starting out. You’ll get used to it as we see it pop up throughout the program.
Example 1: Math equation inline with some text. Note the single dollar signs ($).
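A source line that would produce the result shown next, reconstructed here from the rendered output rather than copied from the original notebook:

```latex
This is my model $y_i = \beta_0 + \beta_1 x_i + \epsilon$.
```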
Result:
This is my model \(y_i = \beta_0 + \beta_1 x_i + \epsilon\).
Example 2: A standalone math equation. Note the double dollar signs ($$).
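A source line that would produce the displayed equation shown next, again reconstructed from the rendered output:

```latex
$$ pr(y_i = 1) = \frac{1}{1 + e^{\beta_0 + \beta_1 x_i}} $$
```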
Result:
\[ pr(y_i = 1) = \frac{1}{1 + e^{\beta_0 + \beta_1 x_i}}\]
We’ve provided some resources in this week’s supplemental readings to help you navigate the labyrinth that is LaTeX.
magic Commands
Magic commands are special commands that can be executed in a Jupyter code cell (so these commands will be relevant if you’re in a Jupyter Notebook or using Atom + Hydrogen). Magic commands are prefixed by the % character. These magic commands are designed to succinctly solve various common problems in standard data analysis.
Magic commands come in two flavors:
Line magics are denoted by a single % prefix and operate on a single line of input.
Cell magics are denoted by a double %% prefix and operate on multiple lines of input.
Here is a short Jupyter Notebook that walks through some useful magic commands. It also provides guidance on Notebook Extensions that add functionality to a Jupyter Notebook.
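As one concrete case, the %timeit line magic times a single statement and the %%timeit cell magic times a whole cell. Magic commands only run inside an IPython/Jupyter session, so the sketch below shows them as comments and uses the standard-library timeit module as a plain-Python stand-in. The statement being timed is arbitrary.

```python
import timeit

# In a Jupyter code cell, the line magic would be written as:
#   %timeit sum(range(1000))
# and the cell magic, placed on the first line of a cell, as:
#   %%timeit
#   total = 0
#   for i in range(1000):
#       total += i

# Plain-Python stand-in: run the statement 1,000 times and report the total.
elapsed = timeit.timeit("sum(range(1000))", number=1000)
print(f"1,000 runs took {elapsed:.4f} seconds")
```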
The following outlines a few common commands that will be useful as you move forward. Disclaimer: some of these commands may differ given your operating system, but it only takes a quick Google search to find out how things are done on your machine.
pwd: check working directory
cd <path>: change working directory
cd ..: go up to the parent directory
cd: go to your home directory
cd -: go back to the previous working directory
ls: list all files in the working directory
mkdir <dir name>: make a directory
mv <old path> <new path>: move file from old path to new path
cp <old path> <new path>: copy file from old path to new path
ctrl + c: stop the current execution
cat <file>: print the entire file
head: view the first \(N\) lines of a file, e.g. head -n 3 file
tail: view the last \(N\) lines of a file, e.g. tail -n 3 file
touch <file name>: create an empty file
echo 'text' > file: write text to a file (overwriting any existing contents)
mv <old file name> <new file name>: rename a file
man <command name>: open the manual page for a command
<command name> -h: print a command’s help text (supported by many, but not all, commands)
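For comparison, several of these commands have close counterparts in Python’s standard library. The sketch below mirrors a handful of them inside a temporary folder, so nothing on your machine is changed. The file and folder names are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway folder that is deleted when the block ends.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    project = root / "project"
    project.mkdir()                                       # mkdir project

    notes = project / "notes.txt"
    notes.touch()                                         # touch notes.txt
    # Roughly: echo 'text' > notes.txt (overwrites the file's contents)
    notes.write_text("line 1\nline 2\nline 3\nline 4\n")

    lines = notes.read_text().splitlines()                # cat notes.txt
    first_three = lines[:3]                               # head -n 3 notes.txt
    last_three = lines[-3:]                               # tail -n 3 notes.txt

    shutil.copy(notes, project / "backup.txt")            # cp notes.txt backup.txt
    notes.rename(project / "renamed.txt")                 # mv notes.txt renamed.txt

    listing = sorted(p.name for p in project.iterdir())   # ls project

print(first_three)
print(last_three)
print(listing)
```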
The following materials were generated for students enrolled in PPOL564. Please do not distribute without permission.
ed769@georgetown.edu | www.ericdunford.com