Move some files around #38

Merged
merged 4 commits into from
Jan 20, 2025
137 changes: 73 additions & 64 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,20 @@ The Message and Resource representations are drawn from work done for the
Unicode [MessageFormat 2 specification](https://github.com/unicode-org/message-format-wg/tree/main/spec)
and the [Message resource specification](https://github.com/eemeli/message-resource-wg/).

Support for XML formats (`android`, `xliff`) is an optional extra;
The library currently supports the following resource formats:

- `android`†: Android string resources (strings.xml)
- `dtd`: .dtd
- `fluent`: Fluent (.ftl)
- `inc`: .inc
- `ini`: .ini
- `plain_json`: Plain JSON (.json)
- `po`: Gettext (.po, .pot)
- `properties`: .properties
- `webext`: WebExtensions (messages.json)
- `xliff`†: XLIFF 1.2, including XCode customizations (.xlf, .xliff)

**†** Support for XML formats (`android`, `xliff`) is an optional extra;
to support them, install as `moz.l10n[xml]`.

## Command-line Tools
Expand Down Expand Up @@ -48,9 +61,33 @@ Fix the formatting for localization resources.
If `paths` is a single directory, it is iterated with `L10nConfigPaths` if `--config` is set, or `L10nDiscoverPaths` otherwise.
If `paths` is not a single directory, its values are treated as glob expressions, with `**` support.
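
The two modes described above can be sketched with the standard library alone. `expand_paths` is a hypothetical helper, not part of `moz.l10n`, and the directory branch uses a plain `os.walk` as a stand-in for `L10nConfigPaths` / `L10nDiscoverPaths`, so treat it as an illustration of the branching rather than the tools' actual behavior.

```python
from __future__ import annotations

import os
from glob import glob


def expand_paths(paths: list[str]) -> list[str]:
    """Hypothetical helper: resolve CLI `paths` into a sorted list of files."""
    if len(paths) == 1 and os.path.isdir(paths[0]):
        # Single directory: walk it. The real tools use L10nConfigPaths or
        # L10nDiscoverPaths here; os.walk is only a stand-in.
        found = [
            os.path.join(dirpath, name)
            for dirpath, _dirs, names in os.walk(paths[0])
            for name in names
        ]
    else:
        # Anything else: treat each value as a glob expression.
        # recursive=True is what makes `**` match nested directories.
        found = [
            match
            for pattern in paths
            for match in glob(pattern, recursive=True)
            if os.path.isfile(match)
        ]
    return sorted(set(found))
```

With this sketch, `expand_paths(["l10n/**/*.ftl"])` would collect `.ftl` files at any depth under `l10n/`, while `expand_paths(["l10n"])` would take the directory branch.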

## moz.l10n.paths
## Python API

### moz.l10n.formats.FORMAT

### L10nConfigPaths
Parsers and serializers are provided for a number of formats,
using common and well-established libraries to take care of the details.
A unified API for these is provided,
such that `FORMAT_parse(text)` will always accept `str` input,
and `FORMAT_serialize(resource)` will always provide a `str` iterator.
All the serializers accept a `trim_comments` argument
which leaves out comments from the serialized result,
but additional input types and options vary by format.

### moz.l10n.formats.detect_format

```python
from moz.l10n.formats import detect_format

def detect_format(name: str | None, source: bytes | str) -> Format | None
```

Detect the format of the input based on its file extension
and/or contents.

Returns a `Format` enum value, or `None` if the input is not recognized.
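
To make the extension-based half of this concrete, here is a minimal standalone sketch. It is not the library's implementation: `guess_format_by_name` is a hypothetical function, the string values stand in for `Format` enum members, and the table is simply the format list from this README. Content-based detection is left out entirely, and mapping `strings.xml` and `messages.json` purely by file name is an assumption of the sketch.

```python
from __future__ import annotations

from os.path import basename, splitext

# Extension table taken from the README's format list; the string values are
# stand-ins for the library's `Format` enum members.
EXTENSIONS = {
    ".dtd": "dtd",
    ".ftl": "fluent",
    ".inc": "inc",
    ".ini": "ini",
    ".json": "plain_json",
    ".po": "po",
    ".pot": "po",
    ".properties": "properties",
    ".xlf": "xliff",
    ".xliff": "xliff",
}

# File names that the format list ties to a specific format (assumption:
# matching on the name alone, without looking at the contents).
FILENAMES = {"strings.xml": "android", "messages.json": "webext"}


def guess_format_by_name(name: str | None) -> str | None:
    """Hypothetical helper: guess a format from the file name only."""
    if not name:
        return None
    base = basename(name)
    if base in FILENAMES:
        return FILENAMES[base]
    # Fall back to the extension; unknown extensions yield None.
    return EXTENSIONS.get(splitext(base)[1].lower())
```

The real `detect_format` also accepts the file contents as `source`, which lets it handle cases a name-only lookup cannot, such as telling apart two JSON-based formats.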

### moz.l10n.paths.L10nConfigPaths

Wrapper for localization config files.

Expand All @@ -66,7 +103,7 @@ Differences:

Does not consider `.l10n-ignore` files.

### L10nDiscoverPaths
### moz.l10n.paths.L10nDiscoverPaths

Automagical localization resource discovery.

Expand All @@ -80,33 +117,11 @@ BCP 47 locale identifiers, i.e. like `aa`, `aa-AA`, `aa-Aaaa`, or `aa-Aaaa-AA`.

An underscore may also be used as a separator, as in `en_US`.
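
The identifier shapes listed above can be checked with a small regular expression. This is a simplified sketch, not the pattern `L10nDiscoverPaths` actually uses: `looks_like_locale` is a hypothetical name, matching is case-sensitive, and allowing three-letter language subtags is an assumption beyond the examples given.

```python
import re

LOCALE_ID = re.compile(
    r"[a-z]{2,3}"              # language subtag, e.g. "en" (three letters: assumption)
    r"(?:[-_][A-Z][a-z]{3})?"  # optional script subtag, e.g. "Latn"
    r"(?:[-_][A-Z]{2})?"       # optional region subtag, e.g. "US"
)


def looks_like_locale(name: str) -> bool:
    """Hypothetical helper: does `name` have one of the locale shapes above?"""
    return LOCALE_ID.fullmatch(name) is not None
```

Under these assumptions, `en_US` and `sr-Cyrl-RS` are accepted, while a directory name such as `locales` is not.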

## moz.l10n.resources

Parsers and serializers are provided for a number of formats,
using common and well-established libraries to take care of the details.
A unified API for these is provided,
such that `FORMAT_parse(text)` will always accept `str` input,
and `FORMAT_serialize(resource)` will always provide a `str` iterator.
All the serializers accept a `trim_comments` argument
which leaves out comments from the serialized result,
but additional input types and options vary by format.

The library currently supports the following resource formats:

- `android`: Android string resources (strings.xml)
- `dtd`: .dtd
- `fluent`: Fluent (.ftl)
- `inc`: .inc
- `ini`: .ini
- `plain_json`: Plain JSON (.json)
- `po`: Gettext (.po, .pot)
- `properties`: .properties
- `webext`: WebExtensions (messages.json)
- `xliff`: XLIFF 1.2, including XCode customizations (.xlf, .xliff)

### add_entries
### moz.l10n.resource.add_entries

```python
from moz.l10n.resource import add_entries

def add_entries(
target: Resource,
source: Resource,
Expand All @@ -126,42 +141,11 @@ Entries are not copied, so further changes will be reflected in both resources.

Returns a count of added or changed entries and sections.

### detect_format
### moz.l10n.resource.l10n_equal

```python
def detect_format(name: str | None, source: bytes | str) -> Format | None
```
from moz.l10n.resource import l10n_equal

Detect the format of the input based on its file extension
and/or contents.

Returns a `Format` enum value, or `None` if the input is not recognized.

### iter_resources

```python
def iter_resources(
root: str,
dirs: list[str] | None = None,
ignorepath: str = ".l10n-ignore"
) -> Iterator[tuple[str, Resource[Message, str] | None]]
```

Iterate through localizable resources under the `root` directory.
Use `dirs` to limit the search to only some subdirectories under `root`.

Yields `(str, Resource | None)` tuples,
with the file path and the corresponding `Resource`,
or `None` for files that could not be parsed as localization resources.

To ignore files, include a `.l10n-ignore` file in `root`,
or some other location passed in as `ignorepath`.
This file uses a git-ignore syntax,
and is always based in the `root` directory.

### l10n_equal

```python
def l10n_equal(a: Resource, b: Resource) -> bool
```

Expand All @@ -171,9 +155,11 @@ Compares the localization-relevant content
Sections with no message entries are ignored,
and the order of sections, entries, and metadata is ignored.

### parse_resource
### moz.l10n.resource.parse_resource

```python
from moz.l10n.resource import parse_resource

def parse_resource(
input: Format | str | None,
source: str | bytes | None = None
Expand All @@ -191,9 +177,11 @@ If the first argument is a string path,
the `source` argument is optional,
as the file will be opened and read.

### serialize_resource
### moz.l10n.resource.serialize_resource

```python
from moz.l10n.resource import serialize_resource

def serialize_resource(
resource: Resource[str, str] | Resource[Message, str],
format: Format | None = None,
Expand All @@ -207,3 +195,24 @@ If `format` is set, it overrides the `resource.format` value.

With `trim_comments`,
all standalone and attached comments are left out of the serialization.

### moz.l10n.util.walk_files

```python
from moz.l10n.util import walk_files

def walk_files(
root: str,
dirs: list[str] | None = None,
ignorepath: str | None = ".l10n-ignore"
) -> Iterator[str]
```

Iterate through all files under the `root` directory.
Use `dirs` to limit the search to only some subdirectories under `root`.

All files and directories with names starting with `.` are ignored.
To ignore other files, include a `.l10n-ignore` file in `root`,
or some other location passed in as `ignorepath`.
This file uses git-ignore syntax,
and is always based in the `root` directory.
4 changes: 2 additions & 2 deletions moz/l10n/bin/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,11 @@
from shutil import copyfile
from textwrap import dedent

from moz.l10n.message import Message
from moz.l10n.formats import Format
from moz.l10n.message.data import Message
from moz.l10n.paths.config import L10nConfigPaths
from moz.l10n.resource import UnsupportedResource, parse_resource, serialize_resource
from moz.l10n.resource.data import Comment, Entry, Resource, Section
from moz.l10n.resource.format import Format

log = logging.getLogger(__name__)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
from os.path import splitext
from typing import Any

# from moz.l10n.resource.xliff.common import xliff_ns
# from moz.l10n.formats.xliff.common import xliff_ns
xliff_ns = {
"urn:oasis:names:tc:xliff:document:1.0",
"urn:oasis:names:tc:xliff:document:1.1",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@

from lxml import etree

from ...message import (
from ...message.data import (
CatchallKey,
Expression,
FunctionAnnotation,
Expand All @@ -29,8 +29,8 @@
SelectMessage,
VariableRef,
)
from ..data import Comment, Entry, Metadata, Resource, Section
from ..format import Format
from ...resource.data import Comment, Entry, Metadata, Resource, Section
from .. import Format

plural_categories = ("zero", "one", "two", "few", "many", "other")
xliff_ns = "urn:oasis:names:tc:xliff:document:1.2"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@

from lxml import etree

from ...message import (
from ...message.data import (
CatchallKey,
Expression,
FunctionAnnotation,
Expand All @@ -30,7 +30,7 @@
SelectMessage,
VariableRef,
)
from ..data import Entry, Metadata, Resource
from ...resource.data import Entry, Metadata, Resource
from .parse import plural_categories, resource_ref, xliff_g, xliff_ns, xml_name


Expand Down
File renamed without changes.
Original file line number Diff line number Diff line change
Expand Up @@ -22,10 +22,9 @@
from sys import maxsize
from typing import Any

from moz.l10n.message import Message, PatternMessage

from ..data import Comment, Entry, Resource, Section
from ..format import Format
from ...message.data import Message, PatternMessage
from ...resource.data import Comment, Entry, Resource, Section
from .. import Format

name_start_char = (
":A-Z_a-z\xc0-\xd6\xd8-\xf6\xf8-\u02ff"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,8 @@
from re import UNICODE, compile
from typing import Any

from moz.l10n.message import Message, PatternMessage

from ..data import Entry, Resource
from ...message.data import Message, PatternMessage
from ...resource.data import Entry, Resource
from .parse import name, re_comment

re_name = compile(name, UNICODE)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,9 +22,9 @@
from fluent.syntax import FluentParser
from fluent.syntax import ast as ftl

from ... import message as msg
from .. import data as res
from ..format import Format
from ...message import data as msg
from ...resource import data as res
from .. import Format


@overload
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,8 @@
from fluent.syntax import FluentSerializer
from fluent.syntax import ast as ftl

from ... import message as msg
from .. import data as res
from ...message import data as msg
from ...resource import data as res


def fluent_serialize(
Expand Down
File renamed without changes.
Original file line number Diff line number Diff line change
Expand Up @@ -17,10 +17,9 @@
from re import compile
from typing import Any

from moz.l10n.message import Message, PatternMessage

from ..data import Comment, Entry, Resource, Section
from ..format import Format
from ...message.data import Message, PatternMessage
from ...resource.data import Comment, Entry, Resource, Section
from .. import Format

re_define = compile(r"#define[ \t]+(\w+)(?:[ \t](.*))?")

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,9 +17,8 @@
from collections.abc import Iterator
from typing import Any

from moz.l10n.message import Message, PatternMessage

from ..data import Entry, Resource
from ...message.data import Message, PatternMessage
from ...resource.data import Entry, Resource


def inc_serialize(
Expand Down
File renamed without changes.
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,9 @@

from iniparse import ini # type: ignore[import-untyped]

from moz.l10n.message import Message, PatternMessage

from ..data import Comment, Entry, Resource, Section
from ..format import Format
from ...message.data import Message, PatternMessage
from ...resource.data import Comment, Entry, Resource, Section
from .. import Format


def ini_parse(source: TextIO | str | bytes) -> Resource[Message, Any]:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,8 @@
from re import search
from typing import Any

from moz.l10n.message import Message, PatternMessage

from ..data import Entry, Resource
from ...message.data import Message, PatternMessage
from ...resource.data import Entry, Resource


def ini_serialize(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,10 +18,9 @@
from json import loads
from typing import Any

from moz.l10n.message import Message, PatternMessage

from ..data import Entry, Resource, Section
from ..format import Format
from ...message.data import Message, PatternMessage
from ...resource.data import Entry, Resource, Section
from .. import Format


def plain_json_parse(source: str | bytes) -> Resource[Message, Any]:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,8 @@
from json import dumps
from typing import Any

from moz.l10n.message import Message, PatternMessage

from ..data import Entry, Resource
from ...message.data import Message, PatternMessage
from ...resource.data import Entry, Resource


def plain_json_serialize(
Expand Down
File renamed without changes.