Commit

update readme for new code org
fgregg committed Oct 25, 2024
1 parent a2e5c42 commit df03633
Showing 2 changed files with 8 additions and 22 deletions.
13 changes: 0 additions & 13 deletions Makefile

This file was deleted.

17 changes: 8 additions & 9 deletions README.md
@@ -52,14 +52,13 @@ probablepeople learns how to parse names/companies through a body of training da
Probablepeople uses [parserator](https://github.com/datamade/parserator), a library for making and improving probabilistic parsers - specifically, parsers that use [python-crfsuite](https://github.com/tpeng/python-crfsuite)'s implementation of conditional random fields. Parserator allows you to train probablepeople's model (a .crfsuite settings file) on labeled training data, and provides tools for easily adding new labeled training data.
#### Building & testing development code

-```
-git clone https://github.com/datamade/probablepeople.git
-cd probablepeople
-pip install -r requirements.txt
-python setup.py develop
-make all
-nosetests .
-```
+```console
+git clone https://github.com/datamade/probablepeople.git
+cd probablepeople
+pip install -e .
+pytest
+```

#### Creating/adding labeled training data (.xml outfile) from unlabeled raw data (.csv infile)

If there are name/company formats that the parser isn't performing well on, you can add them to training data. As probablepeople continually learns about new cases, it will continually become smarter and more robust.
@@ -93,7 +92,7 @@ The parserator `label` command will start a console labeling task, where you wil
parserator train name_data/labeled/person_labeled.xml,name_data/labeled/company_labeled.xml probablepeople --modelfile=generic
parserator train name_data/labeled/person_labeled.xml probablepeople --modelfile=person
parserator train name_data/labeled/company_labeled.xml probablepeople --modelfile=company
-```
+```

## Errors and Bugs
