N1904-TF

Text-Fabric dataset of the Greek New Testament, based on the Nestle 1904 (7th printing) edition.

Nestle 1904 GNT - Feature: ref

Feature group: Sectional
Feature type: Node
Data type: String
Available for node types: word, subphrase, phrase
Used by viewtypes: syntax-view, wg-view

Feature description

The ref feature provides a unique identifier for each individual word in the corpus.

The feature is also populated for phrase and subphrase nodes, but only when such a node consists of a single word node.

Feature values

A compound string indicating book, chapter, verse and sequence number of the word inside the verse formatted as follows:

  MAT 1:2!11

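This format consists of a three-character book code (MAT), a space, the chapter number (1), a colon, the verse number (2), an exclamation mark, and the sequence number of the word within the verse (11).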

Example

To extract all components in this feature using Python, the following code snippet can be used:

    import re

    ref = "MAT 1:2!11"  # example content
    # Pattern matching the book, chapter, verse, and position of the word in the verse
    pattern = r"(\w+)\s(\d+):(\d+)!(\d+)"
    # re.match returns None if ref does not follow the expected format
    match = re.match(pattern, ref)
    book, chapter, verse, positionInVerse = match.groups()  # all four parts are strings
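
Because the regular expression returns every component as a string, sorting raw ref values puts verse 10 before verse 2. The following is a minimal sketch of one way to handle this, not part of the dataset itself: the names WordRef and parse_ref are hypothetical helpers, and the sample values other than "MAT 1:2!11" are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, as described on this page: BOOK CHAPTER:VERSE!POSITION
REF_PATTERN = re.compile(
    r"(?P<book>\w+)\s(?P<chapter>\d+):(?P<verse>\d+)!(?P<position>\d+)"
)


class WordRef(NamedTuple):
    """Hypothetical container for the four parts of a ref value."""
    book: str
    chapter: int
    verse: int
    position: int


def parse_ref(ref: str) -> Optional[WordRef]:
    """Return the parsed parts of a ref value, or None if it does not match."""
    m = REF_PATTERN.fullmatch(ref)
    if m is None:
        return None
    return WordRef(m["book"], int(m["chapter"]), int(m["verse"]), int(m["position"]))


# Illustrative values only; the second one is the example from this page
refs = ["MAT 1:10!1", "MAT 1:2!11", "MAT 1:2!3"]

# Sorting on the parsed tuple compares chapter, verse and position numerically
ordered = sorted(refs, key=parse_ref)
print(ordered)
```

Note that this ordering is only meaningful within a single book, since the book codes themselves would be compared alphabetically rather than in canonical order.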

Notes

The first three characters of this feature value are identical to the value of the feature bookshort.
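
If only the book code is needed, slicing may be simpler than a regular expression. This small sketch assumes, in line with the note above, that the book code always occupies exactly the first three characters and is followed by a single space.

```python
ref = "MAT 1:2!11"  # example value from this page

# Assumption: the book code is always exactly three characters long
book_code = ref[:3]

# Everything after the separating space: chapter, verse and word position
location = ref[4:]

print(book_code, location)
```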

The feature id contains the same information as the feature ref, albeit in a different format.

Source description

The identifier is based on the XML attribute ref of the w (word) tag.

