Update landing page
Robinlovelace committed Jan 22, 2025
1 parent b235358 commit 63119ea
Showing 1 changed file with 45 additions and 40 deletions.
85 changes: 45 additions & 40 deletions index.qmd
@@ -4,46 +4,33 @@ number-sections: false
---

A module on using data science to solve transport problems.
This course is based at the University of Leeds' Institute for Transport Studies.
It has evolved over a decade of teaching and research in the field and aims to teach you up-to-date and future-proof skills with practical examples and reproducible workflows using industry-standard data science tools.

# R, Python or other?

While the focus of the module is on methods that can be implemented in many languages, we expect most people taking this course will use R for the practicals and for the end-of-course project.
We use R because it is hard to beat in terms of a data science environment *with batteries included*, with many mature packages for data manipulation, visualisation, and statistical analysis available within seconds, without having to worry about package conflicts or managing environments.
R is also the language with which the module team has the most experience.

Python is another excellent choice for transport data science, and many of the example code chunks we provide in R have been ported to Python examples, as illustrated below, which shows how to load the R package `{sf}` and the equivalent Python package `{geopandas}`.

::: {.panel-tabset}

## R
# Prerequisites

```r
library(sf)
geo_data = read_sf("geo_data.gpkg")
```
We expect you to have some experience with programming, data science and computing in general before taking this course.
Experience with geographic data is not required, but is helpful.

## Python
## Hardware

```python
import geopandas as gpd
geo_data = gpd.read_file("geo_data.gpkg")
```
Access to a computer that you have permission to install software on, with at least 8 GB of RAM, is highly recommended.
You could use a cloud-based service such as RStudio Cloud, Google Colab, or GitHub Codespaces, but you would need to be comfortable with using these services and would miss out on some of the benefits of using your own computer.

:::
## Command-line experience

Support for Python should be considered a work in progress.
If you do choose to use Python, you will be expected to manage your own Python environment, and to be able to translate the R code examples into Python.
You should be comfortable with computing in general, for example creating folders, moving files, and installing software.
You should be comfortable with using command line interfaces such as PowerShell in Windows, Terminal in macOS, or the Linux shell.

If you are feeling adventurous, you could try using Julia, JavaScript/TypeScript (e.g. via Observable) or another language, but you will be on your own in terms of support.
## Data science experience prerequisites

# Prerequisites
Prior experience of using R or Python (e.g. having used it for work, in previous degrees or having completed an online course) is essential.

## Hardware
Students can demonstrate this by showing evidence that they have worked with R before, or that they have completed an online course such as the first 4 sessions in the [RStudio Primers series](https://rstudio.cloud/learn/primers) or [DataCamp’s Free Introduction to R course](https://www.datacamp.com/courses/free-introduction-to-r).

Access to a computer that you have permission to install software on, with at least 8 GB of RAM, is highly recommended.
You could use a cloud-based service such as RStudio Cloud, Google Colab, or GitHub Codespaces, but you would need to be comfortable with using these services and would miss out on some of the benefits of using your own computer.
Evidence of substantial programming and data science experience in previous professional or academic work, in languages such as R or Python, also constitutes sufficient prerequisite knowledge for the course.

## Software
# Software requirements and installation

You need to install some software to take the course.

@@ -52,7 +39,7 @@ We recommend that most people use R for the practical sessions and the coursewor
If you do choose to use Python or another language, you will have less support: you will need the skills to set up and manage your own environments.
Translations of the R code examples into your chosen language will be very welcome, and contributions to the course materials via [Pull Requests on GitHub](https://github.com/itsleeds/tds/pulls) are encouraged.

### Quickstart with GitHub Codespaces
## Quickstart with GitHub Codespaces

You can use GitHub Codespaces to get started with the course materials in a cloud-based environment.
Sign up to GitHub, fork the repository, and click the "Open in GitHub Codespaces" button above to get started.
@@ -61,7 +48,7 @@ You can also use the following link:
[![Open in GitHub
Codespaces](https://img.shields.io/badge/Open%20in-GitHub%20Codespaces-blue?logo=github.png)](https://github.com/codespaces/new/itsleeds/tds?quickstart=1)

### R
## R

Install a recent version of R (4.3.0 or above) and RStudio (recommended) or another IDE such as VS Code (if you have prior experience with it) with the following links:

@@ -84,31 +71,49 @@ Chapter 2 of Geocomputation with R (the Prerequisites section contains links for
[project management section](https://csgillespie.github.io/efficientR/set-up.html#project-management).
]
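
The R section above asks for R 4.3.0 or above. As a minimal sketch (not part of the course materials), the check could be automated by parsing the first line that `R --version` prints. The function names `parse_r_version` and `meets_minimum` are hypothetical helpers introduced here for illustration, and the example string is a sample of that output rather than a reading from your machine.

```python
import re


def parse_r_version(text):
    """Extract (major, minor, patch) from text such as the output of `R --version`."""
    match = re.search(r"R version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("Could not find an R version string")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version, minimum=(4, 3, 0)):
    """Return True if the version tuple is at least the minimum (4.3.0 by default)."""
    return version >= minimum


# Sample first line of `R --version` output, used here for illustration only.
example_output = 'R version 4.4.1 (2024-06-14) -- "Race for Your Life"'
version = parse_r_version(example_output)
print(version, meets_minimum(version))
```

To check a real installation you could pass in the text returned by running `R --version` yourself, assuming R is on your `PATH`.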

### Python
## Python

If you choose to use Python, you should be able to install it and manage your own Python environment, including installing packages and dealing with package conflicts.
If you use Python we recommend using an environment manager such as `pixi` (which can manage both R and Python environments) or Docker (best practice for reproducibility and isolation).
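
If you manage your own Python environment, a quick check that the packages you need are importable can save time before a practical. The following is a minimal sketch using only the standard library. The helper name `missing_packages` and the package list are assumptions for illustration, not a list of course requirements.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


# Hypothetical package list for illustration; adjust it to the packages you need.
required = ["geopandas", "pandas", "matplotlib"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

This only tests whether each package can be located, not whether the installed versions are compatible with each other, so it complements rather than replaces an environment manager.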

### Docker (advanced)
## Docker (advanced)

We maintain a Docker image that contains all the software you need to complete the course with VS Code, Quarto and a Devcontainer set-up.
This approach ensures reproducibility and can save time installing software.
However, Docker can be hard to install and difficult to use if you are not familiar with it.
We therefore recommend this approach only for people who are confident with Docker and willing to invest time in learning how to use it.
See [the Docker installation instructions](https://docs.docker.com/get-docker/), the [devcontainers documentation on github.com](https://github.com/devcontainers) and the tds [Dockerfile](https://github.com/itsleeds/tds/blob/main/Dockerfile) and [devcontainer.json](https://github.com/itsleeds/tds/blob/main/.devcontainer/devcontainer.json) for guidance on getting started with Docker.

## Command-line experience
# R, Python or other?

You should be comfortable with computing in general, for example creating folders, moving files, and installing software.
You should be comfortable with using command line interfaces such as PowerShell in Windows, Terminal in macOS, or the Linux shell.
While the focus of the module is on methods that can be implemented in many languages, we expect most people taking this course will use R for the practicals and for the end-of-course project.
We use R because it is hard to beat in terms of a data science environment *with batteries included*, with many mature packages for data manipulation, visualisation, and statistical analysis available within seconds, without having to worry about package conflicts or managing environments.
R is also the language with which the module team has the most experience.

## Data science experience prerequisites
Python is another excellent choice for transport data science, and many of the example code chunks we provide in R have been ported to Python examples, as illustrated below, which shows how to load the R package `{sf}` and the equivalent Python package `{geopandas}`.

Prior experience of using R or Python (e.g. having used it for work, in previous degrees or having completed an online course) is essential.
::: {.panel-tabset}

Students can demonstrate this by showing evidence that they have worked with R before, or that they have completed an online course such as the first 4 sessions in the [RStudio Primers series](https://rstudio.cloud/learn/primers) or [DataCamp’s Free Introduction to R course](https://www.datacamp.com/courses/free-introduction-to-r).
## R

Evidence of substantial programming and data science experience in previous professional or academic work, in languages such as R or Python, also constitutes sufficient prerequisite knowledge for the course.
```r
library(sf)
geo_data = read_sf("geo_data.gpkg")
```

## Python

```python
import geopandas as gpd
geo_data = gpd.read_file("geo_data.gpkg")
```

:::

Support for Python should be considered a work in progress.
If you do choose to use Python, you will be expected to manage your own Python environment, and to be able to translate the R code examples into Python.

If you are feeling adventurous, you could try using Julia, JavaScript/TypeScript (e.g. via Observable) or another language, but you will be on your own in terms of support.

# Contributing to the course
