Text-Fabric dataset of the Greek New Testament, based on the Nestle 1904 (7th printing) edition.
Feature group | Feature type | Data type | Available for node types | Used by viewtypes |
---|---|---|---|---|
Orthographic | Node | String | word, subphrase, phrase | syntax-view, wg-view |
The `after` feature includes all material found after a word, such as regular space characters, punctuation marks followed by a space, and text-critical markers. This feature is essential for preserving the context and formatting of the text. This feature is coded in Unicode using polytonic accents over the vowels (oxia, varia, and perispomeni).
This feature is also populated for `phrase` or `subphrase` nodes, but only if they consist of just one `word` node.
The tables below list the most frequent values of this feature, with their frequencies, for each node type.
For `word` nodes (used in `syntax-view` and `wg-view`):
Value | Description | Unicode codepoint | Frequency |
---|---|---|---|
  | Space |   | 119261 |
, | Comma & space | , &   | 9439 |
. | Full stop & space | . &   | 5704 |
· | Middle dot & space | · &   | 2355 |
; | Semicolon & space | ; &   | 969 |
,— | Comma, em dash & space | , & — &   | 18 |
— | Em dash & space | — &   | 7 |
). | Closing round bracket, full stop & space | ) & . &   | 6 |
.]] | Full stop & two right square brackets | . & 2x ] | 4 |
etc. | Various | various | |
For `phrase` nodes (used in `syntax-view`):
Value | Description | Unicode codepoint | Frequency |
---|---|---|---|
  | Space |   | 37661 |
, | Comma & space | , &   | 3892 |
. | Full stop & space | . &   | 2724 |
· | Middle dot & space | · &   | 1187 |
; | Semicolon & space | ; &   | 588 |
,— | Comma, em dash & space | , & — &   | 8 |
). | Closing round bracket, full stop & space | ) & . &   | 4 |
etc. | Various | various | |
For `subphrase` nodes (used in `syntax-view`):
Value | Description | Unicode codepoint | Frequency |
---|---|---|---|
  | Space |   | 119261 |
, | Comma & space | , &   | 9439 |
. | Full stop & space | . &   | 5704 |
· | Middle dot & space | · &   | 2355 |
; | Semicolon & space | ; &   | 969 |
,— | Comma, em dash & space | , & — &   | 18 |
— | Em dash & space | — &   | 7 |
). | Closing round bracket, full stop & space | ) & . &   | 6 |
.]] | Full stop & two right square brackets | . & 2x ] | 4 |
etc. | Various | various | |
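As a rough illustration of how frequency tables like the ones above could be produced, the following sketch counts `after` values with Python's `collections.Counter`. The sample list `after_values` and the helper `frequency_table` are hypothetical and only stand in for values read from the dataset; within Text-Fabric itself, a feature's `freqList()` method is the more likely tool for this.

```python
from collections import Counter

# Hypothetical sample of `after` values, one per word node.
# These are illustrative only and are NOT taken from the dataset.
after_values = [" ", " ", ", ", " ", ". ", " ", "· ", " ", ", ", "; "]


def frequency_table(values):
    """Return (value, count) pairs, most frequent first."""
    return Counter(values).most_common()


table = frequency_table(after_values)
for value, count in table:
    # repr() makes the otherwise invisible trailing space visible
    print(f"{value!r:8} {count}")
```

Printing the values with `repr()` is a deliberate choice here: most values end in a space, which is easy to overlook in a plain table.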
The following image shows the features describing the material found after a word.
The following features describe the full surface text:
The following image shows the relation between these features.
The following text-formatting options are defined in this dataset; the `text-orig-full` format uses this feature:
`A.showFormats()`

format | level | template |
---|---|---|
`lex-orig-plain` | word | `{lemma}{trailer}` |
`lex-translit-plain` | word | `{lextranslit}{trailer}` |
`text-orig-full` | word | `{before}{text}{after}` |
`text-orig-plain` | word | `{text}{trailer}` |
`text-translit-plain` | word | `{translit}{trailer}` |
`text-unaccent-plain` | word | `{unaccent}{trailer}` |
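To make the role of the `{before}{text}{after}` template more concrete, here is a minimal sketch of how such a template might be expanded word by word. It does not use Text-Fabric; the `words` list holds hypothetical feature values chosen for illustration, and `render` is an assumed helper name, not part of any library.

```python
# Hypothetical feature values for a few word nodes.
# They are illustrative only and are not read from the dataset.
words = [
    {"before": "", "text": "Ἐν", "after": " "},
    {"before": "", "text": "ἀρχῇ", "after": " "},
    {"before": "", "text": "ἦν", "after": " "},
    {"before": "", "text": "ὁ", "after": " "},
    {"before": "", "text": "Λόγος", "after": ", "},
]

# Same shape as the text-orig-full template in the formats listing.
TEMPLATE = "{before}{text}{after}"


def render(word_features, template=TEMPLATE):
    """Fill the template for each word and concatenate the results."""
    return "".join(template.format(**w) for w in word_features)


line = render(words)
print(repr(line))
```

Because each word carries its own trailing material, simple concatenation is enough to rebuild the running text; no separator needs to be guessed.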
The `after` feature is based on the XML attribute `after` of the `w` (word) tag.
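The mapping from the XML attribute to the feature can be sketched with Python's standard `xml.etree.ElementTree` module. The XML fragment below is hypothetical: only the `w` tag and its `after` attribute come from the description above, while the wrapping `sentence` element and the placement of the word text inside `w` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment. Only the `w` tag and its `after` attribute
# are taken from the description; everything else is assumed.
xml_fragment = (
    "<sentence>"
    '<w after=" ">Ἐν</w>'
    '<w after=" ">ἀρχῇ</w>'
    '<w after=", ">Λόγος</w>'
    "</sentence>"
)

root = ET.fromstring(xml_fragment)

# Collect (word text, after value) pairs; a missing attribute becomes "".
pairs = [(w.text, w.get("after", "")) for w in root.iter("w")]

for text, after in pairs:
    print(text, repr(after))
```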