Corpus
Manage conlang texts and Leipzig-standard interlinear glossing in a sidebar + editor panel layout.
Navigate: click Corpus in the sidebar.
Text Management
- Create/edit/delete corpus texts
- Metadata: title, description, tags, timestamps
- Raw text (in your conlang) and free translation
Interlinear Glossing
Leipzig glossing standard annotations:
- Auto-gloss: Raw text → split by sentences → split by spaces → auto-match from lexicon
- Manual editing: Each token can be edited independently
- Surface form
- Morpheme break (use "-" to separate morphemes)
- Gloss labels
- Linked entry ID
- IPA
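The basic flow above (sentences, then space-separated tokens, then a lexicon lookup) can be sketched roughly as follows. This is an illustration only, not the app's actual code: the `Token` fields mirror the editable fields listed above, while `LEXICON`, `auto_gloss`, and the sample words are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    surface: str                    # surface form as written in the raw text
    morphemes: str = ""             # morpheme break, "-" separates morphemes
    gloss: str = ""                 # gloss labels
    entry_id: Optional[str] = None  # linked lexicon entry ID, if matched
    ipa: str = ""

# Hypothetical lexicon: surface form -> (entry ID, gloss, IPA)
LEXICON = {
    "mira": ("e1", "see", "ˈmi.ra"),
    "tol": ("e2", "stone", "tol"),
}

def auto_gloss(raw_text: str) -> list[list[Token]]:
    """Split into sentences, then into tokens, then try an exact lexicon match."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", raw_text) if s.strip()]
    result = []
    for sentence in sentences:
        tokens = []
        for word in sentence.split():
            token = Token(surface=word)
            hit = LEXICON.get(word.lower())
            if hit:
                token.entry_id, token.gloss, token.ipa = hit
                token.morphemes = word.lower()
            tokens.append(token)  # unmatched tokens stay empty for manual editing
        result.append(tokens)
    return result

glossed = auto_gloss("Mira tol. Tol kesu!")
```

In this sketch the unmatched word "kesu" keeps an empty gloss and no linked entry, which corresponds to a token you would edit by hand.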
Auto-gloss engine (advanced)
The auto-gloss pipeline supports:
- exact lexicon match
- concatenative reverse parsing (multi-prefix + stem + multi-suffix)
- non-concatenative candidates (infix/circumfix/reduplication/ablaut)
- confidence-based auto-apply + pending review queue
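To make the concatenative case concrete, here is a minimal sketch of reverse parsing that strips any number of prefixes and suffixes until a known stem remains, then sorts each word into one of the report buckets. The affix tables, the scoring rule (0.1 off per affix), and the `AUTO_APPLY_THRESHOLD` value are all assumptions made for illustration; the app's real scoring is not documented here, and non-concatenative candidates are not covered.

```python
from dataclasses import dataclass

# Hypothetical affix and stem tables with their gloss labels.
PREFIXES = {"na": "NEG", "ki": "2SG"}
SUFFIXES = {"ta": "PST", "n": "PL"}
STEMS = {"mira": "see", "tol": "stone"}
AUTO_APPLY_THRESHOLD = 0.75  # assumed cutoff, not the app's real value

@dataclass
class Candidate:
    morphemes: str
    gloss: str
    confidence: float

def parse(word: str) -> list[Candidate]:
    """Return every prefix* + stem + suffix* analysis, best first."""
    found: list[Candidate] = []

    def strip_suffixes(rest, pre, pre_gl, suf, suf_gl):
        if rest in STEMS:
            forms = pre + [rest] + suf
            glosses = pre_gl + [STEMS[rest]] + suf_gl
            # Toy scoring: each affix costs 0.1 confidence.
            score = round(1.0 - 0.1 * (len(forms) - 1), 2)
            found.append(Candidate("-".join(forms), "-".join(glosses), score))
        for form, label in SUFFIXES.items():
            if rest.endswith(form) and len(rest) > len(form):
                strip_suffixes(rest[: -len(form)], pre, pre_gl,
                               [form] + suf, [label] + suf_gl)

    def strip_prefixes(rest, pre, pre_gl):
        strip_suffixes(rest, pre, pre_gl, [], [])
        for form, label in PREFIXES.items():
            if rest.startswith(form) and len(rest) > len(form):
                strip_prefixes(rest[len(form):], pre + [form], pre_gl + [label])

    strip_prefixes(word, [], [])
    return sorted(found, key=lambda c: c.confidence, reverse=True)

def triage(word: str) -> str:
    """Classify a word as auto-applied, pending review, or unresolved."""
    candidates = parse(word)
    if not candidates:
        return "unresolved"
    best = candidates[0]
    return "auto-applied" if best.confidence >= AUTO_APPLY_THRESHOLD else "pending review"
```

With these toy tables, `parse("namiratan")` yields the break `na-mira-ta-n` glossed `NEG-see-PST-PL`, and because three affixes lower the score below the assumed threshold, `triage` sends it to the pending review queue.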
Per-corpus report persistence
The auto-gloss report is stored per corpus item and records:
- total tokens
- auto-applied
- pending review
- unresolved
Switching to another corpus item no longer reuses the previous item's report.
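One way to picture per-item persistence is a store keyed by corpus item ID, so a lookup for one item can never return another item's counts. The names below (`AutoGlossReport`, `save_report`, `report_for`, the status strings) are hypothetical and only illustrate the idea; they are not the app's storage format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AutoGlossReport:
    total_tokens: int
    auto_applied: int
    pending_review: int
    unresolved: int

# Hypothetical in-memory store: corpus item ID -> its latest report
reports: dict[str, AutoGlossReport] = {}

def save_report(corpus_id: str, statuses: list[str]) -> AutoGlossReport:
    """Build a report from per-token statuses and store it under this item only."""
    report = AutoGlossReport(
        total_tokens=len(statuses),
        auto_applied=statuses.count("auto-applied"),
        pending_review=statuses.count("pending review"),
        unresolved=statuses.count("unresolved"),
    )
    reports[corpus_id] = report
    return report

def report_for(corpus_id: str) -> Optional[AutoGlossReport]:
    """Return the stored report for this item, or None if it was never glossed."""
    return reports.get(corpus_id)

save_report("text-a", ["auto-applied", "auto-applied", "pending review", "unresolved"])
```

Here `report_for("text-b")` returns `None` rather than the report saved for `"text-a"`.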
Steps
- Click "New" → enter title and description
- Write a passage in your language in Raw Text
- Write the natural language translation in Free Translation
- Click "Auto-gloss"; the app tries to match each word against the lexicon
- Manually edit unmatched tokens
Step-by-step: Review pending suggestions
- Run "Auto-gloss".
- Open the pending suggestions panel.
- Check confidence and trace columns.
- Apply suggestions one by one, or click "Apply All".
- Re-check the unresolved token count in the report.
Apply SCA to corpus with diff (advanced)
Use "Preview & Apply SCA to Corpus" to evolve the corpus text directly.
What diff includes
- scope: original text / gloss-line original / token surface
- before -> after values
- context snippet
- checkbox selection (default all selected)
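A diff row with the fields listed above could be modeled along these lines. This is a sketch under assumptions: `DiffRow`, `build_diff`, and the single intervocalic voicing rule in `sound_change` are hypothetical stand-ins, not the app's SCA or its data model.

```python
import re
from dataclasses import dataclass

@dataclass
class DiffRow:
    scope: str             # "original text", "gloss-line original", or "token surface"
    before: str
    after: str
    context: str           # short snippet around the changed value
    selected: bool = True  # every row starts selected

def sound_change(text: str) -> str:
    """Hypothetical stand-in for the SCA: t becomes d between vowels."""
    return re.sub(r"(?<=[aeiou])t(?=[aeiou])", "d", text)

def build_diff(scope: str, values: list[str]) -> list[DiffRow]:
    """Emit one row per value that the sound change actually alters."""
    rows = []
    for i, before in enumerate(values):
        after = sound_change(before)
        if after != before:
            # Context: the value plus its immediate neighbours.
            context = " ".join(values[max(0, i - 1): i + 2])
            rows.append(DiffRow(scope, before, after, context))
    return rows

rows = build_diff("token surface", ["mira", "kata", "tol"])
```

Only "kata" changes under this toy rule, so the preview would contain a single pre-selected row.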
Step-by-step: Safe corpus evolution
- Click "Preview & Apply SCA to Corpus".
- Review the diff table.
- Deselect changes you do not want.
- Click "Apply Selected Changes".
- The system automatically runs auto-gloss again on the updated corpus.
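The selective apply step can be thought of as follows: only changes that are still selected are written back, and the glossing pass then runs again on the result. The `Change` record, `apply_selected`, and the dictionary-based `rerun_auto_gloss` are hypothetical simplifications for illustration.

```python
from dataclasses import dataclass

@dataclass
class Change:
    index: int             # position of the token in the corpus
    before: str
    after: str
    selected: bool = True

def apply_selected(tokens: list[str], changes: list[Change]) -> list[str]:
    """Apply only selected changes; skip any whose 'before' value no longer matches."""
    updated = list(tokens)
    for change in changes:
        if change.selected and updated[change.index] == change.before:
            updated[change.index] = change.after
    return updated

def rerun_auto_gloss(tokens: list[str], lexicon: dict[str, str]) -> list[str]:
    """Toy re-gloss: exact lookup, with '?' marking tokens that need review."""
    return [lexicon.get(token, "?") for token in tokens]

tokens = ["kata", "mita", "tol"]
changes = [
    Change(0, "kata", "kada"),
    Change(1, "mita", "mida", selected=False),  # deselected in the preview
]
updated = apply_selected(tokens, changes)
glosses = rerun_auto_gloss(updated, {"kada": "house", "mita": "river", "tol": "stone"})
```

The deselected change leaves "mita" untouched, so it still matches its existing lexicon entry after the rerun.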
Close behavior
Use the red circular close icon at the top-right corner of the panel to dismiss the preview at any time.
Lexicon Integration
After linking a token to a lexicon entry, hovering shows entry details (POS, definition, IPA).
Advanced example: diachronic corpus pass
Scenario: you maintain a proto text and evolve it for a daughter language.
- Prepare baseline corpus in parent language.
- Configure SCA in daughter language.
- Run corpus SCA diff preview and apply selected changes.
- Let auto-gloss rerun automatically.
- Resolve remaining low-confidence items.
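As a rough illustration of this scenario, the sketch below applies an ordered list of sound changes to a parent-language text and then lists the evolved words that the daughter lexicon does not yet cover. The two rules, the sample words, and the daughter lexicon are invented for the example; real SCA rule syntax in the app may look quite different.

```python
import re

# Hypothetical daughter-language rules, applied in order.
RULES = [
    (r"(?<=[aeiou])t(?=[aeiou])", "d"),  # t becomes d between vowels
    (r"a$", "e"),                        # word-final a becomes e
]

def evolve(word: str) -> str:
    """Run every rule over the word, feeding each result into the next rule."""
    for pattern, replacement in RULES:
        word = re.sub(pattern, replacement, word)
    return word

def evolve_text(text: str) -> str:
    return " ".join(evolve(word) for word in text.split())

def unresolved(text: str, lexicon: dict[str, str]) -> list[str]:
    """Words with no daughter-lexicon entry, i.e. items left for manual review."""
    return [word for word in text.split() if word not in lexicon]

parent_text = "kata mira tol"
daughter_text = evolve_text(parent_text)
daughter_lexicon = {"kade": "house", "tol": "stone"}
remaining = unresolved(daughter_text, daughter_lexicon)
```

Rule order matters here: "kata" first becomes "kada" and only then "kade". The leftover word "mire" is the kind of item you would resolve by hand in the last step.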