From 63119ea792d1c3f9b8a389db4498495c473338e1 Mon Sep 17 00:00:00 2001
From: robinlovelace
Date: Wed, 22 Jan 2025 11:28:14 +0000
Subject: [PATCH] Update landing page

---
 index.qmd | 85 +++++++++++++++++++++++++++++--------------------------
 1 file changed, 45 insertions(+), 40 deletions(-)

diff --git a/index.qmd b/index.qmd
index b76201f..c45ec70 100644
--- a/index.qmd
+++ b/index.qmd
@@ -4,46 +4,33 @@ number-sections: false
 ---
 
 A module on using data science to solve transport problems.
+This course is based at the University of Leeds' Institute for Transport Studies.
+It has evolved over a decade of teaching and research in the field and aims to teach you up-to-date and future-proof skills with practical examples and reproducible workflows using industry-standard data science tools.
 
-# R, Python or other?
-
-While the focus of the module is on methods that can be implemented in many languages, we expect most people taking this course will use R for the practicals and for the end our course project.
-We use R because it is hard to beat in terms of a data science environment *with batteries included*, with many mature packages for data manipulation, visualisation, and statistical analysis available within seconds, without having to worry about package conflicts or managing environments.
-R is also the language with which the module team has the most experience.
-
-Python is another excellent choice for transport data science, and many of the example code chunks we provide in R have been ported to Python examples, as illustrated below, which shows how to load the R package `{sf}` and the equivalent Python package `{geopandas}`.
-
-::: {.panel-tabset}
-
-## R
+# Prerequisites
 
-```r
-library(sf)
-geo_data = read_sf("geo_data.gpkg")
-```
+We expect you to have some experience with programming, data science and computing in general before taking this course.
+Experience with geographic data is not required, but is helpful.
 
-## Python
+## Hardware
 
-```python
-import geopandas as gpd
-geo_data = gpd.read_file("geo_data.gpkg")
-```
+Access to a computer that you have permission to install software on, with at least 8 GB of RAM, is highly recommended.
+You could use a cloud-based service such as RStudio Cloud, Google Colab, or GitHub Codespaces, but you would need to be comfortable with using these services and would miss out on some of the benefits of using your own computer.
 
-:::
+## Command-line experience
 
-, but the support for Python should be considered work in progress.
-If you do choose to use Python, you will be expected to manage your own Python environment, and to be able to translate the R code examples into Python.
+You should be comfortable with computing in general, for example creating folders, moving files, and installing software.
+You should be comfortable with using command line interfaces such as PowerShell in Windows, Terminal in macOS, or the Linux shell.
 
-If you are feeling adventurous, you could try using Julia, JavaScript/TypeScript (e.g. via Observable) or another language, but you will be on your own in terms of support.
+## Data science experience prerequisites
 
-# Prerequisites
+Prior experience of using R or Python (e.g. having used it for work, in previous degrees or having completed an online course) is essential.
 
-## Hardware
+Students can demonstrate this by showing evidence that they have worked with R before, have completed an online course such as the first 4 sessions in the [RStudio Primers series](https://rstudio.cloud/learn/primers) or [DataCamp’s Free Introduction to R course](https://www.datacamp.com/courses/free-introduction-to-r).
 
-You could use a cloud-based service such as RStudio Cloud, Google Colab, or GitHub Codespaces, but you would need to be comfortable with using these services and would miss out on some of the benefits of using your own computer. +Evidence of substantial programming and data science experience in previous professional or academic work, in languages such as R or Python, also constitutes sufficient pre-requisite knowledge for the course. -## Software +# Software requirements and installation You need to install some software to take the course. @@ -52,7 +39,7 @@ We recommend that most people use R for the practical sessions and the coursewor If you do choose to use Python or another language, you will have less support: you will need the skills to set-up and manage your own environments. Translations of the R code examples into your chosen language will be very welcome, and contributions to the course materials via [Pull Requests on GitHub](https://github.com/itsleeds/tds/pulls) are encouraged. -### Quickstart with GitHub Codespaces +## Quickstart with GitHub Codespaces You can use GitHub Codespaces to get started with the course materials in a cloud-based environment. Sign-up to GitHub, fork the repository, and click the "Open in GitHub Codespaces" button above to get started. @@ -61,7 +48,7 @@ You can also use the following link: [![Open in GitHub Codespaces](https://img.shields.io/badge/Open%20in-GitHub%20Codespaces-blue?logo=github.png)](https://github.com/codespaces/new/itsleeds/tds?quickstart=1) -### R +## R Install a recent version of R (4.3.0 or above) and RStudio (recommended) or another IDE such as VS Code (if you have prior experience with it) with the the following links: @@ -84,12 +71,12 @@ Chapter 2 of Geocomputation with R (the Prerequisites section contains links for [project management section](https://csgillespie.github.io/efficientR/set-up.html#project-management). 
 ]
 
-### Python
+## Python
 
 If you choose to use Python, you should be able to install it and manage your own Python environment, including installing packages and dealing with package conflicts.
 If you use Python we recommend using an environment manager such as `pixi` (which can manage both R and Python environments) or Docker (best practice for reproducibility and isolation).
 
-### Docker (advanced)
+## Docker (advanced)
 
 We maintain a Docker image that contains all the software you need to complete the course with VS Code, quarto and a Devcontainer set-up.
 Advantages of this approach include that it ensures reproducibility and can save time installing software.
@@ -97,18 +84,36 @@ Disadvantages include that it can be hard to install Docker and can be difficult
 We therefore recommend this approach only for people who are confident with Docker and willing to invest time in learning how to use it.
 See [the Docker installation instructions](https://docs.docker.com/get-docker/), the [devcontainers documentation on github.com](https://github.com/devcontainers) and the tds [Dockerfile](https://github.com/itsleeds/tds/blob/main/Dockerfile) and [devcontainer.json](https://github.com/itsleeds/tds/blob/main/.devcontainer/devcontainer.json) for guidance on getting started with Docker.
 
-## Command-line experience
+# R, Python or other?
 
-You should be comfortable with computing in general, for example creating folders, moving files, and installing software.
-You should be comfortable with using command line interfaces such as PowerShell in Windows, Terminal in macOS, or the Linux shell.
+While the focus of the module is on methods that can be implemented in many languages, we expect most people taking this course will use R for the practicals and for the end-of-course project.
+We use R because it is hard to beat in terms of a data science environment *with batteries included*, with many mature packages for data manipulation, visualisation, and statistical analysis available within seconds, without having to worry about package conflicts or managing environments.
+R is also the language with which the module team has the most experience.
 
-## Data science experience prerequisites
+Python is another excellent choice for transport data science, and many of the example code chunks we provide in R have been ported to Python examples, as illustrated below, which shows how to load the R package `{sf}` and the equivalent Python package `{geopandas}`.
 
-Prior experience of using R or Python (e.g. having used it for work, in previous degrees or having completed an online course) is essential.
+::: {.panel-tabset}
 
-Students can demonstrate this by showing evidence that they have worked with R before, have completed an online course such as the first 4 sessions in the [RStudio Primers series](https://rstudio.cloud/learn/primers) or [DataCamp’s Free Introduction to R course](https://www.datacamp.com/courses/free-introduction-to-r).
+## R
 
-Evidence of substantial programming and data science experience in previous professional or academic work, in languages such as R or Python, also constitutes sufficient pre-requisite knowledge for the course.
+```r
+library(sf)
+geo_data = read_sf("geo_data.gpkg")
+```
+
+## Python
+
+```python
+import geopandas as gpd
+geo_data = gpd.read_file("geo_data.gpkg")
+```
+
+:::
+
+However, support for Python should be considered work in progress.
+If you do choose to use Python, you will be expected to manage your own Python environment, and to be able to translate the R code examples into Python.
+
+If you are feeling adventurous, you could try using Julia, JavaScript/TypeScript (e.g. via Observable) or another language, but you will be on your own in terms of support.
 
 # Contributing to the course