Newline character as list delimiter? #12

Open
supergeoff opened this issue Dec 9, 2019 · 5 comments
Comments

@supergeoff

I can't figure out how to parse a line-by-line file (CSV or some other list).

from pyleri import Choice, Grammar, Keyword, List, Optional, Ref, Regex, Sequence, Token
import os


class MyGrammar(Grammar):
	element = Regex(r"\w+")
	START = List(element, "\n", opt=True)


grammar = MyGrammar()
result = grammar.parse("word\nword\n").as_str()

print(result)

error at position 5, expecting:

I tried various combinations of \n and os.linesep as the delimiter...

No luck

Any ideas?

@joente
Member

joente commented Dec 9, 2019

Maybe your use-case can be solved with Repeat instead of a List:

from pyleri import Repeat, Grammar, Regex

class MyGrammar(Grammar):
	element = Regex(r"\w+")
	START = Repeat(element)

grammar = MyGrammar()
print(grammar.parse("word\nword\n").as_str())  # parsed successfully

However, if it gets more complicated you might run into problems, since pyleri handles all white-space (including line breaks) equally. This could be solved rather easily though: for example, we could allow a user to define the white-space used in a grammar with RE_WHITESPACE, like we do with RE_KEYWORDS.
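
Until something like RE_WHITESPACE exists, one possible workaround is to turn line breaks into a visible, non-white-space delimiter before parsing, so that List can see them. The following is only a sketch under stated assumptions: the helper names `newlines_to_delimiter` and `parse_lines` are hypothetical (not part of pyleri), and it assumes the chosen delimiter (";" here) never occurs in the data.

```python
def newlines_to_delimiter(text, delimiter=";"):
    """Join the non-empty lines of `text` with a non-white-space delimiter.

    Hypothetical helper, not part of pyleri.
    """
    if delimiter in text:
        # Assumption: the delimiter must not already appear in the input.
        raise ValueError("delimiter already occurs in the input")
    # splitlines() splits on \n, \r\n and \r (among others); blank lines are dropped.
    lines = (line.strip() for line in text.splitlines())
    return delimiter.join(line for line in lines if line)


def parse_lines(text):
    """Parse line-separated words by feeding the rewritten text to a List grammar."""
    # Imported lazily so the helper above can be used without pyleri installed.
    from pyleri import Grammar, List, Regex

    class LineGrammar(Grammar):
        element = Regex(r"\w+")
        # ";" is not white-space, so it survives pyleri's white-space handling.
        START = List(element, ";")

    return LineGrammar().parse(newlines_to_delimiter(text))
```

With pyleri installed, `parse_lines("word\nword\n").as_str()` can be used to check the result. Lines that contain spaces would need a different `element` pattern than `\w+`.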

@supergeoff
Author

supergeoff commented Dec 9, 2019

As I understand it, List only works with delimiters that are not white-space, and line breaks count as white-space?
Allowing the definition of white-space with something like RE_KEYWORDS would indeed be a nice feature.

Thank you for your quick reply.

Edit: I can see a lot of examples where lines contain white-space and are separated only by line breaks...

@wqp89324

wqp89324 commented Feb 5, 2021

Will the RE_WHITESPACE feature be available soon? Thanks!

@wqp89324

@joente you mentioned "pyleri handles all white-space (including line breaks) equally". What does that mean exactly? Is all white-space (including line breaks) ignored in the same way?

@joente
Member

joente commented Feb 11, 2021

@wqp89324, pyleri is currently using strip without any arguments for handling white-space between nodes.
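
To illustrate what stripping without arguments implies, here is a small sketch using only Python's built-in string methods. The helper `skip_whitespace` is hypothetical and is not pyleri's actual code; it only mimics the idea of discarding all leading white-space before the next node is matched.

```python
def skip_whitespace(text, pos):
    """Return the position of the first non-white-space character at or after `pos`.

    Hypothetical helper that mimics argument-less stripping; not pyleri's code.
    """
    rest = text[pos:]
    # lstrip() with no arguments removes spaces, tabs, \r and \n alike.
    return pos + (len(rest) - len(rest.lstrip()))


# Spaces, tabs and line breaks are all removed by a bare strip().
cleaned = " \t\nword\r\n".strip()

# After the first "word" (positions 0-3) the "\n" is skipped as white-space,
# so a "\n" delimiter can no longer be matched at the next position.
next_pos = skip_whitespace("word\nword\n", 4)
```

Here `next_pos` is 5, which is consistent with the "error at position 5" reported at the top of this issue: the line break has already been consumed as white-space when List looks for its delimiter.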
