Many templates have been floating around in the DMOJ community for validation and input handling in checkers. This commit aims to consolidate them. It has two main goals:
- Correct. Duh.
- Simple. Other templates that circulate, including the ones I have published, are too complex. People naively try and write their own. I am sick and tired of reading over incorrect validators.

These templates forgo some principles of good design (such as object-oriented programming) in favour of pure simplicity. They should be simple enough that they are understandable by the broader community, and are not a black box. Hopefully this also dissuades re-writing.
1 parent a75b939, commit 549d173
Showing 68 changed files with 1,150 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
name: build | ||
on: [push, pull_request] | ||
jobs: | ||
  lint: | ||
    runs-on: ubuntu-latest | ||
    steps: | ||
      - uses: actions/checkout@v3 | ||
      - name: Install clang-format 12 | ||
        run: | | ||
          wget -O clang-format https://github.com/DMOJ/clang-tools-static-binaries/releases/download/master-5ea3d18c/clang-format-12_linux-amd64 | ||
          chmod a+x ./clang-format | ||
      - name: Run clang-format | ||
        run: find sample_files/problem_setting \( -name '*.h' -or -name '*.cpp' -or -name '*.c' \) -print0 | xargs -0 ./clang-format --dry-run -Werror --color | ||
  cpp_template_tests: | ||
    runs-on: ubuntu-latest | ||
    steps: | ||
      - uses: actions/checkout@v3 | ||
      - name: Run C++ template tests | ||
        run: | | ||
          cd sample_files/problem_setting/test | ||
          ./run_test.sh |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
# C++ Problem Setting Templates - `cpp_psetting_templates` | ||
|
||
There are three C++ input-handling templates provided for aiding problem setters. They are as follows: | ||
|
||
- [Validator Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/validator.cpp) | ||
- [Identical Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/identical_checker_interactor.cpp) | ||
- [Standard Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/standard_checker_interactor.cpp) | ||
|
||
Examples of their use are as follows: | ||
|
||
- A validator for <https://dmoj.ca/problem/aplusb> is [here](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/examples/validator.cpp). | ||
- An identical-style checker for <https://dmoj.ca/problem/seq3> is [here](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/examples/identical_checker.cpp). | ||
- A standard-style interactor for <https://dmoj.ca/problem/seed2> is [here](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/examples/standard_interactor.cpp). | ||
|
||
## Validator | ||
|
||
This is a template for validating the input data of problems. It aims to be simple and, of course, correct. It contains eight functions. The first three are whitespace functions: | ||
|
||
- `void readSpace()` expects a space at the current position in the input, and aborts the program if there is not a space. | ||
- `void readNewLine()` expects a newline at the current position in the input. | ||
- `void readEOF()` expects the input file to end immediately at the current position. | ||
|
||
The remaining five are for actual content: | ||
|
||
- `std::string readToken(char min_char = 0, char max_char = 127)` returns the next token in the input stream. A token is defined as a whitespace-separated string. If the next character in the input is a whitespace character, this function aborts the program. The optional arguments `min_char` and `max_char` can be used to enforce a range on the characters in the token. For instance, `readToken('a', 'z')` reads a string of lowercase English letters. | ||
- `std::string readLine(char min_char = 0, char max_char = 127)` returns the next line in the input stream. Specifically, it reads until it encounters a `\n`, and discards it (the newline is not part of the returned string). `min_char` and `max_char` are the same as for `readToken`. If `readLine` encounters an EOF, it fails. | ||
- `long long readInt(long long lo, long long hi)` calls `readToken()` and parses the token as an integer. It aborts on overflow, malformed integers, and if the resultant integer is not in the range [lo, hi], inclusive. Leading zeroes and `-0` are not accepted. Please note that reading `unsigned long long` types will not work properly with this function, as it returns a `long long`. | ||
- `long double readFloat(long double lo, long double hi, long double eps = 1e-9)` calls `readToken()` and parses the token as a float. It aborts on overflow, malformed floats, and if the resultant float is not in the range [lo, hi], inclusive, using the provided epsilon to perform the comparison. Scientific notation and NaNs are not accepted, nor are leading zeroes. `-0` is allowed. Trailing zeroes in the decimal portion are also permitted. | ||
- `std::vector<T> readIntArray(size_t N, long long lo, long long hi)` parses the next space-separated N integers into an array, and then reads a final newline. It must be given a template argument, which is the type of the array elements. For example, `readIntArray<int>(5, 1, 10)` reads five space-separated integers into a `std::vector<int>`, where each integer is in the range [1, 10], inclusive. Because this method uses `readInt` internally, it does not handle `unsigned long long` properly. | ||
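To make the strictness of these functions concrete, here is a minimal sketch of a validator for a hypothetical input format (a line containing `N`, then a line of `N` space-separated integers). It is not the template itself: `StrictReader`, `ValidationError`, and `isValidInput` are illustrative names, the reader throws where the real template aborts, and it uses `std::regex` for the integer pattern purely for portability.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <regex>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical error type: thrown where the real template would abort.
struct ValidationError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

class StrictReader {
public:
  explicit StrictReader(std::istream &in) : in_(in) {}

  void readSpace() { expectChar(' '); }
  void readNewLine() { expectChar('\n'); }
  void readEOF() {
    if (in_.get() != EOF) throw ValidationError("expected EOF");
  }

  // Reads characters up to (not including) the next whitespace or EOF.
  std::string readToken() {
    std::string token;
    while (in_.peek() != EOF && !std::isspace(in_.peek())) {
      token.push_back(static_cast<char>(in_.get()));
    }
    if (token.empty()) throw ValidationError("expected a token");
    return token;
  }

  // Rejects leading zeroes, "-0", overflow, and out-of-range values.
  long long readInt(long long lo, long long hi) {
    static const std::regex pattern("0|-?[1-9][0-9]*");
    const std::string token = readToken();
    if (!std::regex_match(token, pattern)) {
      throw ValidationError("malformed integer");
    }
    long long value = 0;
    try {
      value = std::stoll(token);
    } catch (const std::out_of_range &) {
      throw ValidationError("integer overflow");
    }
    if (value < lo || value > hi) throw ValidationError("out of range");
    return value;
  }

private:
  void expectChar(char expected) {
    if (in_.get() != expected) throw ValidationError("unexpected character");
  }
  std::istream &in_;
};

// Assumed format: "N\n" with 1 <= N <= 100, then N integers in [-1000, 1000]
// separated by single spaces, a final newline, and nothing else.
bool isValidInput(const std::string &text) {
  std::istringstream in(text);
  StrictReader reader(in);
  try {
    const long long n = reader.readInt(1, 100);
    reader.readNewLine();
    for (long long i = 0; i < n; i++) {
      if (i) reader.readSpace();
      reader.readInt(-1000, 1000);
    }
    reader.readNewLine();
    reader.readEOF();
  } catch (const ValidationError &) {
    return false;
  }
  return true;
}
```

With this sketch, `isValidInput("3\n1 2 3\n")` holds, while a missing trailing newline, a doubled space, or a leading zero each cause rejection.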
|
||
A small caveat: `readToken` and `readLine` will throw if the string exceeds 10 million characters. | ||
|
||
`readFloat()` and `readLine()` will likely be of no use for many validators, and can be safely deleted. Similarly, `readIntArray` can be deleted if unneeded. | ||
|
||
## Checkers/Interactors | ||
|
||
The next pair of templates are for checkers/interactors. The difference is the type of whitespace handling: the identical checker/interactor expects whitespace to match exactly. The standard checker/interactor handles whitespace like the `standard` checker. | ||
|
||
The checkers and interactors are designed for the `coci` bridged checker/interactor type. However, updating the codes used and the order of command line parameters to work with other types should not be challenging. | ||
|
||
Both files can be used for either checkers or interactors, with the following caveat: interactors MUST close `stdout` BEFORE calling `readEOF()`, so that the user process can terminate in case it _also_ expects an EOF. Checker stdout is used for feedback displayed to the user, so `stdout` should not be closed in a checker. Validators do not need to worry about this either. Only interactors do, and they should call `readEOF()` only once they have finished communicating with the user, to clean up and assert that the user didn't send any trailing data. | ||
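The ordering requirement for interactors can be sketched as follows. This is an illustration only: `finishInteraction` is a hypothetical helper, `to_user` stands in for the interactor's `stdout`, and `from_user` stands in for the stream the template reads through `std::cin`. The EOF check mirrors the identical-style `readEOF()`.

```cpp
#include <cassert>
#include <cstdio>
#include <istream>
#include <sstream>

// Hypothetical helper showing the required order of operations.
// Returns true when the user sent no trailing data.
bool finishInteraction(std::FILE *to_user, std::istream &from_user) {
  // 1. Close our output first, so a user process blocked on reading from us
  //    sees EOF and can terminate.
  if (to_user != nullptr) {
    std::fclose(to_user);
  }
  // 2. Only then wait for the user's stream to end.
  return from_user.get() == EOF;
}
```

In a real interactor the first step would presumably be `std::fclose(stdout)`; doing the steps in the opposite order risks both processes waiting on each other.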
|
||
### Identical Checker/Interactor | ||
|
||
This template expects whitespace to match exactly, just as in the validator. The template is simpler, but it is less forgiving to contestants. | ||
|
||
The same functions are provided, but have slightly different behaviour: | ||
|
||
- `readSpace(), readNewLine(), readEOF()`: These exit with a Presentation Error if the check fails. | ||
- `readToken()`: This exits with a Presentation Error if the token is empty, and WA if any character is not in range. | ||
- `readLine()`: This exits with a Presentation Error if an EOF is encountered, and WA if any character is not in range. | ||
- `readInt(), readIntArray(), readFloat()`: These exit with WA on failure. | ||
|
||
Four new functions are provided: | ||
|
||
- `exitWA(), exitPE()`: These functions exit immediately with the specified code. | ||
- `assertWA(bool), assertPE(bool)`: These functions exit if the provided condition is false. Useful as a replacement for `assert()`. | ||
|
||
One new namespace is provided. The `CheckerCodes` namespace contains the constants `AC, WA, PE`, and `PARTIAL`. It is recommended to use them in `main` to return a verdict. For instance, to return an AC verdict, use `return CheckerCodes::AC;` | ||
|
||
Finally, there is an empty function `errorHook()`, which is called whenever any of the functions in the API would exit with an error. This can be used to implement functionality such as partial points, or outputting `-1` to signal errors in interactors. | ||
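As a rough illustration of how `errorHook()` might be used in an interactor, the sketch below writes `-1` to the user before the verdict is produced. It makes several assumptions so that it can run in isolation: `Exit` is a hypothetical exception thrown in place of `std::exit`, `to_user` is a string stream standing in for the interactor's `stdout`, and `judgeGuess` is an invented example task.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace CheckerCodes {
constexpr int AC = 0;
constexpr int WA = 1;
} // namespace CheckerCodes

// Hypothetical stand-in for std::exit, so the sketch does not kill the process.
struct Exit {
  int code;
};

// Stands in for the interactor's stdout in this sketch.
std::ostringstream to_user;

// Called by every failing exit path; here it tells the user to stop.
void errorHook() { to_user << -1 << '\n'; }

void exitWA() {
  errorHook();
  throw Exit{CheckerCodes::WA};
}

void assertWA(bool condition) {
  if (!condition) {
    exitWA();
  }
}

// Invented example: the user must guess the secret number exactly.
int judgeGuess(long long secret, long long guess) {
  try {
    assertWA(guess == secret);
  } catch (const Exit &e) {
    return e.code;
  }
  return CheckerCodes::AC;
}
```

Because the hook runs inside `exitWA()`, checks written with `assertWA()` do not need to remember to notify the user themselves.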
|
||
### Standard Checker/Interactor | ||
|
||
This template is much more complex, but is more lenient for submissions. It matches the whitespace of the `standard` builtin checker. | ||
|
||
The behaviour of the non-whitespace functions is the same as for the identical checker template, with the following caveats: | ||
|
||
- `exitPE()` and `assertPE()` don't exist, since the builtin `standard` checker never uses the Presentation Error code. Checker writers are discouraged from using it. | ||
- `readToken()` always exits with WA on failure. | ||
- `readLine()` doesn't exist, since the way it should process whitespace is not clear for the standard checker. Checker writers reaching for this method should consider the identical checker template instead, or rethink their output format entirely. | ||
- `CheckerCodes` doesn't contain the constant for `PE`. | ||
|
||
The remaining functions have the same behaviour as with the identical checker. | ||
|
||
#### Whitespace functions | ||
|
||
These functions all exit with `WA` on failure instead of `PE`, for the reasons described above. | ||
|
||
The code maintains a flag of the type of whitespace it expects, one of `NONE, SPACE, NEWLINE, ALL`, initially ALL. `readSpace()` sets the flag to `SPACE` and `readNewLine()` sets it to `NEWLINE`. | ||
|
||
`readToken()` sets the flag to `NONE` and consumes all the whitespace, and exits with WA if either: | ||
|
||
- The current flag is `SPACE` and a newline was found. | ||
- The current flag is `NEWLINE` and no newline was found. | ||
|
||
It never exits with WA if the flag is `ALL`. It causes an IE if the flag is `NONE`, which only happens when `readToken()` is called twice in a row without an intervening whitespace function. Note that `readInt()` and `readFloat()` call `readToken()` internally. | ||
|
||
`readEOF()` sets the flag to `ALL`, consumes all whitespace, and then exits with WA if any character remains in the stream. | ||
|
||
No two whitespace functions can be called back to back, except for `readNewLine()` followed by `readEOF()`. The reason for the exception is that the canonical form for output should have a trailing newline, and so this exception allows checker writers to think in terms of the canonical form of the output. It also allows calling `readIntArray()` right before `readEOF()`, since `readIntArray()` internally calls `readNewLine()`. | ||
|
||
Note that this scheme is lazy. This is intentional; it allows the same code to be used by interactors without difficulty. |
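The flag scheme described in this section can be sketched as a small state machine. This is a reading of the description above, not the template's actual code: `LazyReader`, `WrongAnswer`, and `InternalError` are hypothetical names, failures throw instead of exiting, and treating an empty token or a disallowed back-to-back whitespace call as an error is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the WA and IE exit paths.
struct WrongAnswer : std::runtime_error {
  WrongAnswer() : std::runtime_error("WA") {}
};
struct InternalError : std::logic_error {
  InternalError() : std::logic_error("IE") {}
};

class LazyReader {
  enum Flag { NONE, SPACE, NEWLINE, ALL };

public:
  explicit LazyReader(std::istream &in) : in_(in) {}

  // Whitespace functions only record what is expected; nothing is read yet.
  void readSpace() { expect(SPACE); }
  void readNewLine() { expect(NEWLINE); }

  std::string readToken() {
    if (flag_ == NONE) throw InternalError(); // two tokens in a row
    const bool saw_newline = skipWhitespace();
    if (flag_ == SPACE && saw_newline) throw WrongAnswer();
    if (flag_ == NEWLINE && !saw_newline) throw WrongAnswer();
    flag_ = NONE;
    std::string token;
    while (in_.peek() != EOF && !std::isspace(in_.peek())) {
      token.push_back(static_cast<char>(in_.get()));
    }
    if (token.empty()) throw WrongAnswer(); // assumption: missing token is WA
    return token;
  }

  void readEOF() {
    if (flag_ == SPACE) throw InternalError(); // only readNewLine may precede
    flag_ = ALL;
    skipWhitespace();
    if (in_.peek() != EOF) throw WrongAnswer();
  }

private:
  void expect(Flag next) {
    if (flag_ != NONE) throw InternalError(); // back-to-back whitespace calls
    flag_ = next;
  }

  // Consumes all pending whitespace; reports whether a newline was among it.
  bool skipWhitespace() {
    bool saw_newline = false;
    while (in_.peek() != EOF && std::isspace(in_.peek())) {
      if (in_.get() == '\n') saw_newline = true;
    }
    return saw_newline;
  }

  std::istream &in_;
  Flag flag_ = ALL;
};

// Example use: expects two tokens on one line, then the end of the output.
bool acceptsTwoOnOneLine(const std::string &output) {
  std::istringstream in(output);
  LazyReader reader(in);
  try {
    reader.readToken();
    reader.readSpace();
    reader.readToken();
    reader.readNewLine();
    reader.readEOF();
  } catch (const WrongAnswer &) {
    return false;
  }
  return true;
}
```

Because whitespace is only consumed when the next token or EOF is requested, the reader never blocks waiting for input that an interactive user has not sent yet.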
147 changes: 147 additions & 0 deletions
sample_files/problem_setting/examples/identical_checker.cpp
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,147 @@ | ||
#include <algorithm> | ||
#include <cctype> | ||
#include <cstdio> | ||
#include <cstdlib> | ||
#include <fstream> | ||
#include <iostream> | ||
#include <numeric> | ||
#include <regex.h> | ||
#include <stdexcept> | ||
#include <string> | ||
#include <vector> | ||
|
||
namespace CheckerCodes { | ||
constexpr int AC = 0; | ||
constexpr int WA = 1; | ||
constexpr int PE = 2; | ||
constexpr int PARTIAL = 7; | ||
} // namespace CheckerCodes | ||
|
||
namespace regex_helpers { | ||
|
||
regex_t compile(const char *pattern) { | ||
  regex_t re; | ||
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) { | ||
    throw std::runtime_error("Pattern failed to compile."); | ||
  } | ||
  return re; | ||
} | ||
|
||
bool match(regex_t re, const std::string &text) { | ||
  return regexec(&re, text.c_str(), 0, NULL, 0) == 0; | ||
} | ||
|
||
} // namespace regex_helpers | ||
|
||
void errorHook(); | ||
|
||
void exitWA() { | ||
  errorHook(); | ||
  std::exit(CheckerCodes::WA); | ||
} | ||
|
||
void exitPE() { | ||
  errorHook(); | ||
  std::exit(CheckerCodes::PE); | ||
} | ||
|
||
void assertWA(bool condition) { | ||
  if (!condition) { | ||
    exitWA(); | ||
  } | ||
} | ||
|
||
void assertPE(bool condition) { | ||
  if (!condition) { | ||
    exitPE(); | ||
  } | ||
} | ||
|
||
void readSpace() { assertPE(std::cin.get() == ' '); } | ||
void readNewLine() { assertPE(std::cin.get() == '\n'); } | ||
void readEOF() { assertPE(std::cin.get() == EOF); } | ||
|
||
std::string readToken(char min_char = 0, char max_char = 127) { | ||
  static constexpr size_t MAX_TOKEN_SIZE = 1e7; | ||
  std::string token; | ||
  int c = std::cin.get(); | ||
  assertPE(!isspace(c)); | ||
  while (!isspace(c) && c != EOF) { | ||
    assertWA(token.size() < MAX_TOKEN_SIZE); | ||
    assertWA(min_char <= c && c <= max_char); | ||
    token.push_back(char(c)); | ||
    c = std::cin.get(); | ||
  } | ||
  std::cin.unget(); | ||
  return token; | ||
} | ||
|
||
long long readInt(long long lo, long long hi) { | ||
  static regex_t re = regex_helpers::compile("^(0|-?[1-9][0-9]*)$"); | ||
  std::string token = readToken(); | ||
  assertWA(regex_helpers::match(re, token)); | ||
|
||
  long long parsedInt; | ||
  try { | ||
    parsedInt = std::stoll(token); | ||
  } catch (const std::invalid_argument &) { | ||
    exitWA(); | ||
  } catch (const std::out_of_range &) { | ||
    exitWA(); | ||
  } | ||
  assertWA(lo <= parsedInt && parsedInt <= hi); | ||
  return parsedInt; | ||
} | ||
|
||
template <typename T> | ||
std::vector<T> readIntArray(size_t N, long long lo, long long hi) { | ||
  std::vector<T> arr; | ||
  arr.reserve(N); | ||
  for (size_t i = 0; i < N; i++) { | ||
    if (i) { | ||
      readSpace(); | ||
    } | ||
    arr.push_back(readInt(lo, hi)); | ||
  } | ||
  readNewLine(); | ||
  return arr; | ||
} | ||
|
||
void errorHook() {} | ||
|
||
// readLine() and readFloat() removed for brevity. | ||
|
||
int main(int argc, char **argv) { | ||
  std::ifstream judge_input(argv[1]); | ||
  std::ifstream submission_output(argv[2]); | ||
  std::cin.rdbuf(submission_output.rdbuf()); | ||
  std::ifstream judge_answer(argv[3]); | ||
|
||
  int N, K; | ||
  judge_input >> N >> K; | ||
|
||
  // If any integer is greater than K, we give an immediate WA. | ||
  std::vector<int> arr = readIntArray<int>(N, 0, K); | ||
  // We can read EOF now, since we are done with the input. This makes it easier | ||
  // to remember. | ||
  readEOF(); | ||
|
||
  // Note that we must store the sum in a long long, since it may overflow a | ||
  // 32-bit integer. | ||
  long long sum = std::accumulate(arr.begin(), arr.end(), 0LL); | ||
|
||
  assertWA(sum == K); | ||
|
||
  // It turns out that the minimum product is always zero, since [0, 0, ..., K] | ||
  // is always valid. | ||
  // Thus, it suffices to check for a zero in the array. | ||
  if (std::find(arr.begin(), arr.end(), 0) == arr.end()) { | ||
    // No zero found. Give partial points: | ||
    // Output to stderr for the coci contrib module to grant partial AC. | ||
    std::cerr << "partial 50/100" << std::endl; | ||
    // Output to stdout to give user feedback. Newline appears as space on | ||
    // judge, so omit it. | ||
    std::cout << "50/100 points" << std::flush; | ||
    return CheckerCodes::PARTIAL; | ||
  } | ||
  return CheckerCodes::AC; | ||
} |