The JISC newspapers dataset, after pre-processing with the jisc-wrangler tool, exhibits a slightly different directory structure to that assumed by alto2txt.
Instead of a single newspaper issue subdirectory of the form mmdd/, the JISC structure splits the month and day into separate subdirectories: mm/dd/.
This results in warnings like:
and the XML files in the subdirectory are not processed.
The task is to implement a simple workaround to accommodate the non-standard JISC directory structure.
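One possible shape for such a workaround is sketched below. It is not taken from the alto2txt code base: the function name `find_issue_dirs`, the idea of walking a single year directory, and the returned `(mmdd, path)` pairs are all assumptions made for illustration. The sketch accepts both layouts and normalises the JISC `mm/dd/` form back to an `mmdd` label.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple

# Hypothetical helper patterns, not part of alto2txt.
MMDD = re.compile(r"^\d{4}$")        # standard layout, e.g. "0215"
TWO_DIGITS = re.compile(r"^\d{2}$")  # JISC layout, e.g. "02" then "15"


def find_issue_dirs(year_dir: Path) -> Iterator[Tuple[str, Path]]:
    """Yield (mmdd, path) for each issue directory under a year directory.

    Handles the assumed standard layout (mmdd/) and the JISC layout (mm/dd/).
    Directories matching neither pattern are skipped.
    """
    for child in sorted(p for p in year_dir.iterdir() if p.is_dir()):
        if MMDD.match(child.name):
            # Standard layout: the directory name is already mmdd.
            yield child.name, child
        elif TWO_DIGITS.match(child.name):
            # JISC layout: child is the month, its subdirectories are days.
            for day in sorted(p for p in child.iterdir() if p.is_dir()):
                if TWO_DIGITS.match(day.name):
                    yield child.name + day.name, day
```

With a helper like this, the code that currently expects `mmdd/` could iterate over the yielded paths and look for XML files in each one, regardless of which layout the dataset uses.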