This repo contains the frontend for Korp, Språkbanken's word research platform using the IMS Open Corpus Workbench (CWB). Korp is a great tool for searching and visualising natural language corpus data.
Korp is mainly developed by Språkbanken at the University of Gothenburg, Sweden. Other organizations that use the software also contribute.
Documentation:
- Frontend documentation
- Backend documentation
- Sparv - The pipeline used to tag and otherwise process raw Swedish-language corpus data is documented here
- Språkbanken's Korp configuration directory (supplement to documentation)
Install yarn: https://yarnpkg.com
- install all dependencies: yarn
- run development server: yarn start
- build a dist-version: yarn build
Declare dependencies using yarn add pkg, or yarn add --dev pkg for dev dependencies.
npm has not worked previously, but the status is unknown right now.
We use webpack to build Korp and webpack-dev-server to run a local server. To include new code or resources, import or require them where needed:
import { aFunction } from 'new-dependency'
or
const nd = require("new-dependency")
nd.aFunction()
or
const imgPath = require("img/image.png")
const myTemplate = `<img src='${imgPath}'>`
Some dependencies are only specified in app/index.ts.
About the current loaders in webpack.config.js:
- pug and html files: all src-attributes in <img> tags and all hrefs in <link> tags will be loaded by webpack and replaced in the markup. Uses file loader so that requiring a pug or html file will give the path to the file back.
- js files are added to the bundle
- all images and fonts are added to the bundle using file loader and gives back a file path.
- css and scss are added to the bundle. urls will be loaded and replaced by webpack.
In addition to this, some specific files will simply be copied as is, for example Korp mode-files.
Use config.yml for settings needed in the frontend. In some cases, mode-files can be used. For example, it is possible to have different backends for modes.
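One way to picture how a mode-file could supply a different backend is as a set of overrides layered on top of the general settings. The sketch below is only an illustration: the function `effectiveSettings` and the setting names `backendUrl` and `defaultLanguage` are hypothetical and are not taken from Korp's actual configuration keys or loading code.

```typescript
// Illustrative only: the setting names below are hypothetical, not Korp's actual keys.
type Settings = Record<string, string | number | boolean>

// Combine general settings with the overrides of the active mode.
// Later spreads win, so a mode value replaces the general value for the same key.
function effectiveSettings(base: Settings, modeOverrides: Settings = {}): Settings {
    return { ...base, ...modeOverrides }
}

// General settings, as they might come from config.yml
const base: Settings = {
    backendUrl: "https://example.org/korp-backend",
    defaultLanguage: "en",
}

// A mode that points to a different backend and leaves everything else alone
const otherMode: Settings = { backendUrl: "https://example.org/other-backend" }

const settings = effectiveSettings(base, otherMode)
```

Here `settings.backendUrl` comes from the mode, while `settings.defaultLanguage` is still the general value.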
There are several instances of Korp. Here is a list of some:
- Språkbanken Text
- The Language Bank of Finland (Kielipankki)
- Iceland / Stofnun Árna Magnússonar í íslenskum fræðum
- Tromsø / Giellatekno
- Copenhagen / Institut for Nordiske Studier og Sprogvidenskab
When developing, the frontend is served at http://localhost:9111 by default.
Host and port can be changed by the environment variables:
KORP_HOST=<host>
KORP_PORT=<port>
Environment variables can be entered in the .env file, which is git-ignored.
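As a rough sketch of how the two variables could be turned into a server address with the documented default, consider the helper below. The name `resolveDevServerAddress` is hypothetical, and the default host `localhost` is an assumption inferred from the default URL above; the actual webpack configuration may read the variables differently.

```typescript
// Hypothetical helper, not taken from the Korp source.
type Env = Record<string, string | undefined>

interface DevServerAddress {
    host: string
    port: number
}

function resolveDevServerAddress(env: Env): DevServerAddress {
    // Assumed default host, inferred from http://localhost:9111
    const host = env["KORP_HOST"] || "localhost"
    // Environment variables are strings, so the port has to be converted
    const rawPort = env["KORP_PORT"]
    const port = rawPort ? Number(rawPort) : 9111
    // Reject non-numeric, fractional, or out-of-range ports
    if (!(port >= 1 && port <= 65535) || Math.floor(port) !== port) {
        throw new Error(`Invalid KORP_PORT: ${rawPort}`)
    }
    return { host, port }
}
```

Called with an empty object it returns the defaults; called with `{ KORP_HOST: "0.0.0.0", KORP_PORT: "8080" }` it returns that host and the numeric port 8080.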
It is also possible to serve the frontend over HTTPS using the environment variables:
KORP_HTTPS=true
KORP_KEY=<path_to_key>-key.pem
KORP_CERT=<path to cert>.pem
The key and cert can be created using mkcert.
mkcert korp.spraakbanken.gu.se
mkcert -install
Now use korp.spraakbanken.gu.se as the value for KORP_HOST. It must also be added to /etc/hosts.
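The HTTPS variables depend on each other: the key and certificate paths only matter when KORP_HTTPS is true. A minimal sketch of that check follows, assuming the variables are read as plain strings. The helper `resolveHttpsOptions` is hypothetical, and how Korp's build actually passes the files to the development server is not shown here.

```typescript
// Hypothetical helper, not taken from the Korp source.
type Env = Record<string, string | undefined>

interface HttpsOptions {
    keyPath: string
    certPath: string
}

// Returns null when HTTPS is not requested, otherwise the key and cert paths.
function resolveHttpsOptions(env: Env): HttpsOptions | null {
    if (env["KORP_HTTPS"] !== "true") {
        return null
    }
    const keyPath = env["KORP_KEY"]
    const certPath = env["KORP_CERT"]
    if (!keyPath || !certPath) {
        throw new Error("KORP_HTTPS=true requires both KORP_KEY and KORP_CERT")
    }
    return { keyPath, certPath }
}

// Example using the file names that mkcert generates for the host above
const httpsOptions = resolveHttpsOptions({
    KORP_HTTPS: "true",
    KORP_KEY: "./korp.spraakbanken.gu.se-key.pem",
    KORP_CERT: "./korp.spraakbanken.gu.se.pem",
})
```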
Development is done on the dev branch. These changes are not necessarily yet stable and well-tested.
Once tested, they can be merged to the master branch in a release.
When doing a release:
- Update version in package.json to the next version
- Add relevant changes to CHANGELOG.md
- Check that the user manual and development documentation are up to date
- Merge dev to master (using --no-ff)
- Tag the merge commit with the new version (prefixed with v, see the other tag names)
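The merge and tag steps above can be sketched as a small helper that produces the git commands for a given version. The function `releaseCommands` is hypothetical, the version number in the example is made up, and the use of a lightweight tag (plain `git tag`) is an assumption; check the existing tags to see which kind the project uses.

```typescript
// Hypothetical helper that lists the git commands for the merge and tag steps.
function releaseCommands(version: string): string[] {
    // The version is given as in package.json, without the "v" prefix
    if (version.length === 0 || version.charAt(0) === "v") {
        throw new Error("Pass the bare version, for example 1.2.3")
    }
    const tag = `v${version}`
    return [
        "git checkout master",
        // --no-ff forces a merge commit even when a fast-forward is possible
        "git merge --no-ff dev",
        // Assumption: a lightweight tag on the merge commit
        `git tag ${tag}`,
    ]
}

// Made-up version number, for illustration only
const commands = releaseCommands("1.2.3")
```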
As an external developer, when forking this repository, you may choose to pull from dev and/or master, depending on your needs for latest versus stable changes.