Day one: 22 February 2013

Introduction

Two different modes of working:

  • interactive
  • building code

Today is interactive. As before, we’ll be using the IPython notebook.

Next week we’ll be building code.

The idea of software carpentry:

  • Learning how to use tools.
  • Homework for this week: editors.

See Choosing an editor for a review of some choices.

The Python environment

  • Python itself comes with many useful modules in the Python standard library:
    • os, sys, shutil, glob, tempfile
    • urllib2, HTMLParser, xml, many other web libraries
    • math, decimal (decimal arithmetic with user-settable precision), random
    • csv, database access with sqlite3 and others
    • tarfile, zipfile, gzip and other compression formats

    The standard library also includes debugging, documentation and testing tools, and much else.

  • Numpy is a Python library defining arrays of data, with many routines for manipulating arrays including basic linear algebra, random number generation and Fourier transforms. Python + Numpy gets you most of what MATLAB can do.

  • Scipy adds a large library of scientific code built on top of Numpy. It is a collection of many different kinds of routines including:

    • scipy.io: reading and writing scientific data formats, including MATLAB .mat files
    • scipy.ndimage: image processing tools such as smoothing, convolution and resampling for N-D (2D, 3D, 4D etc.) arrays
    • scipy.linalg: expanded linear algebra tools, with interfaces to much of the highly optimized LAPACK linear algebra library
    • scipy.optimize: tools for finding optimum values (minima or maxima) of functions
    • scipy.sparse: working with sparse arrays
    • scipy.interpolate: routines for interpolating data points
    • scipy.stats: routines for statistical distributions, fitting and tests

    Please see the scipy reference guide for more detail.

  • Matplotlib provides 2D and some 3D plotting, with an interface modeled after MATLAB. See the matplotlib gallery for a taster of the kind of things you can do.

  • Cython: write Python code but with the ability to optimize and compile it down to the C level, often giving very large increases in execution speed.

  • Sympy: a library for symbolic mathematics. It is a computer algebra system, like Mathematica or Maple, that allows you to manipulate mathematical symbols and functions. It is useful for such things as defining, simplifying and solving equations, and finding integrals and derivatives.

  • Pandas: high-level fast data analysis using R-like data frames to hold and manipulate data.

  • scikit-learn: an extensive machine learning library, containing algorithms such as independent component analysis and support vector machines, among many others.

  • scikit-image: 2D image processing

  • statsmodels: defining and estimating statistical models.
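As a small illustration of the standard library modules listed above, the sketch below uses decimal, csv and math together. The data and the variable names are made up for the example; it is one possible way to use these modules, not a recommended recipe.

```python
import csv
import io
import math
from decimal import Decimal, getcontext

# Ordinary floats are binary, so 0.1 + 0.2 is not exactly 0.3
float_sum = 0.1 + 0.2
print(float_sum == 0.3)

# decimal does exact decimal arithmetic
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))

# The precision (number of significant digits) can be set by the user
getcontext().prec = 50
third = Decimal(1) / Decimal(3)
print(third)

# csv can parse comma-separated text; here the "file" is an in-memory
# string with invented subject scores
text = "subject,score\ns01,1.5\ns02,2.5\n"
rows = list(csv.DictReader(io.StringIO(text)))
scores = [float(row["score"]) for row in rows]

# math.fsum gives an accurate floating point sum
mean_score = math.fsum(scores) / len(scores)
print(mean_score)
```

In the IPython notebook you can run each group of lines in its own cell and inspect the variables as you go.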

There are many other libraries, including some specific to neuroimaging. We’ll meet one of those, nibabel, next week.
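Most of these libraries, including nibabel, work with Numpy arrays, so it may help to see a few array operations before going further. The following is a minimal sketch, assuming Numpy is installed and imported under the usual alias np; the array values are invented for illustration.

```python
import numpy as np

# A 2 by 3 array of floating point values
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
print(a.shape)

# Mean over the first axis (down the rows), one value per column
col_means = a.mean(axis=0)
print(col_means)

# Basic linear algebra: matrix product of a with its transpose
gram = a.dot(a.T)
print(gram)

# Solve the linear system m x = b
m = np.array([[2.0, 0.0],
              [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(m, b)
print(x)

# Fourier transform of a constant signal: only the zero-frequency
# term is non-zero
spectrum = np.fft.fft(np.ones(4))
print(spectrum)

# Random numbers from a seeded generator, so results are repeatable
rng = np.random.RandomState(0)
noise = rng.normal(size=(2, 3))
print(noise.shape)
```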

The Python libraries

Introduction to Python

Continuing on from last week: