Text-Fabric dataset of the Greek New Testament, based on the Nestle 1904 (7th printing) edition.
About this dataset

| Feature group | Feature type | Data type | Available for node types | Used by viewtype |
|---|---|---|---|---|
| Sectional | Node | String | word, subphrase, phrase | syntax-view, wg-view |
This feature provides a unique identifier for each individual word in the corpus.
This feature is also populated for phrase and subphrase nodes, but only when they consist of a single word node.
The id is formatted as follows:
The letter 'n' (node) followed by an 11-digit unique id in the format
BBCCCVVVWWW
BB => zero-padded book, the first NT book (Matthew) starts at 40
CCC => zero-padded chapter
VVV => zero-padded verse
WWW => zero-padded word index (instance within the verse)
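
The format above can be decoded with a short regular expression. The following is a minimal sketch in Python; the names `WordId`, `ID_PATTERN` and `parse_word_id` are illustrative helpers and are not part of the dataset or of Text-Fabric.

```python
import re
from typing import NamedTuple


class WordId(NamedTuple):
    """Numeric components of an id value (hypothetical helper type)."""
    book: int
    chapter: int
    verse: int
    word: int


# 'n' followed by BB (2 digits), CCC, VVV and WWW (3 digits each)
ID_PATTERN = re.compile(r"n(\d{2})(\d{3})(\d{3})(\d{3})")


def parse_word_id(value: str) -> WordId:
    """Split an id such as 'n40001001001' into its four numeric parts."""
    match = ID_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid word id: {value!r}")
    # int() drops the zero padding of each component
    return WordId(*(int(group) for group in match.groups()))


# Book 40 (Matthew), chapter 1, verse 1, first word of the verse
print(parse_word_id("n40001001001"))
```

Returning a named tuple keeps the four components addressable by name (for example `.chapter`) while still comparing equal to a plain tuple.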
The feature ref contains the same information as the feature id, albeit in a different format.
A related feature, which references not individual words but word groups, clauses, and sentences, is nodeid.
The ID is derived from the XML attribute xml:id of the w (word) node. When loading the XML source in Python, note that the xml:id attribute belongs to the XML namespace, so it must be referenced in code as {http://www.w3.org/XML/1998/namespace}id.
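
As an illustration of the namespaced attribute lookup, the sketch below reads xml:id with Python's standard `xml.etree.ElementTree` module. The XML fragment is a made-up sample: the `sentence` wrapper element and the two words are assumptions for demonstration only and do not reproduce the actual structure of the source files.

```python
import xml.etree.ElementTree as ET

# Fully qualified name of the xml:id attribute as ElementTree exposes it
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical fragment; the real source files are structured differently
SAMPLE = (
    "<sentence>"
    '<w xml:id="n40001001001">Βίβλος</w>'
    '<w xml:id="n40001001002">γενέσεως</w>'
    "</sentence>"
)

root = ET.fromstring(SAMPLE)

# w.get("xml:id") would return None; the namespaced key is required
word_ids = [w.get(XML_ID) for w in root.iter("w")]
print(word_ids)
```

The `xml` prefix is predefined by the XML specification, so the parser resolves it without an explicit namespace declaration in the document.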