Text-Fabric dataset of the Greek New Testament, based on the Nestle 1904 (7th printing) edition.
Feature group | Feature type | Data type | Available for node types | Used by viewtype |
---|---|---|---|---|
Sectional | Node | String | word, subphrase, phrase | syntax-view, wg-view |
This feature provides a unique identifier for each individual word in the corpus.
It is also populated for phrase or subphrase nodes, but only if they consist of a single word node.
The id is formatted as follows: the letter 'n' (node) followed by an 11-digit unique id in the format BBCCCVVVWWW, where:
BB => zero-padded book number; the first NT book (Matthew) is 40
CCC => zero-padded chapter
VVV => zero-padded verse
WWW => zero-padded word index (instance within the verse)
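The layout above can be decoded with a few lines of Python. The following is a minimal sketch, not part of the dataset's tooling: the names `WordId`, `parse_word_id`, and `format_word_id` are hypothetical, and the sample value `n40001001001` is simply constructed from the format description (book 40, chapter 1, verse 1, word 1).

```python
import re
from typing import NamedTuple


class WordId(NamedTuple):
    """Numeric components of an id value (hypothetical helper type)."""
    book: int
    chapter: int
    verse: int
    word: int


# 'n' followed by BB (2 digits), CCC, VVV and WWW (3 digits each).
_ID_PATTERN = re.compile(r"n(\d{2})(\d{3})(\d{3})(\d{3})")


def parse_word_id(value: str) -> WordId:
    """Split an id such as 'n40001001001' into its four numeric parts."""
    match = _ID_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid word id: {value!r}")
    return WordId(*(int(group) for group in match.groups()))


def format_word_id(parts: WordId) -> str:
    """Rebuild the zero-padded id string from its numeric parts."""
    return (
        f"n{parts.book:02d}{parts.chapter:03d}"
        f"{parts.verse:03d}{parts.word:03d}"
    )
```

For example, `parse_word_id("n40001001001")` yields book 40, chapter 1, verse 1, word 1, and `format_word_id` turns that tuple back into the same string.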
The feature ref contains the same information as the feature id, albeit in a different format.
A related feature, nodeid, references word groups, clauses, and sentences rather than individual words.
The ID is derived from the XML attribute xml:id of the w (word) node. When loading the XML source in Python, note that the xml:id attribute is part of an XML namespace, so it should be referenced in code as {http://www.w3.org/XML/1998/namespace}id.
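As an illustration of the namespace note, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. The XML fragment is a simplified, hypothetical stand-in for the real source (the `sentence` wrapper and the id values are assumptions), and `word_ids` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Fully qualified attribute name for xml:id in ElementTree's {namespace}local form.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified, hypothetical fragment; the real source files are far richer.
SAMPLE = (
    '<sentence>'
    '<w xml:id="n40001001001">Βίβλος</w>'
    '<w xml:id="n40001001002">γενέσεως</w>'
    '</sentence>'
)


def word_ids(xml_text: str) -> list:
    """Return the xml:id value of every w (word) element, in document order."""
    root = ET.fromstring(xml_text)
    return [w.get(XML_ID) for w in root.iter("w")]


if __name__ == "__main__":
    for identifier in word_ids(SAMPLE):
        print(identifier)
```

Looking the attribute up by its prefixed name, as in `w.get("xml:id")`, returns `None` with ElementTree, because the parser stores the attribute under the expanded namespace form.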