Text-Fabric dataset of the Greek New Testament, based on the Nestle 1904 (7th printing) edition.
About this dataset
This Text-Fabric database offers two distinct view types for displaying syntax trees. This is possible due to a partial duplication of the data, using dedicated node types that are associated with each view type. While most features are associated with both view types, some features are tuned to one particular view type, for example by using matching or dedicated nomenclature. The association between node type and view type is shown in the following table.
Viewtype | Display syntax tree | Invocation | Associated node types |
---|---|---|---|
wg-view | In agnostic terms like word groups | A.viewtype('wg') | wg |
syntax-view | In linguistic terms like phrases and clauses | A.viewtype('syntax') | subphrase, phrase, clause, group |
Note: the node types word, sentence, verse, chapter, and book are common to both views.
Note that it is also possible to reset the display and show all nodes by entering the command A.viewtype('reset').
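The association in the table above can be written down as a small lookup. This is only an illustrative sketch of which node types each view type brings into play; the names `COMMON_TYPES`, `VIEW_SPECIFIC_TYPES`, and `node_types_for` are hypothetical and are not part of the dataset's app.

```python
# Node types shared by both views (from the note above).
COMMON_TYPES = ("word", "sentence", "verse", "chapter", "book")

# Node types tied to one view type (from the table above).
VIEW_SPECIFIC_TYPES = {
    "wg": ("wg",),
    "syntax": ("subphrase", "phrase", "clause", "group"),
}


def node_types_for(viewtype):
    """Return the set of node types associated with a view type.

    'reset' is modelled here as "all node types", following the
    description of A.viewtype('reset') above.
    """
    if viewtype == "reset":
        specific = tuple(
            t for types in VIEW_SPECIFIC_TYPES.values() for t in types
        )
    else:
        specific = VIEW_SPECIFIC_TYPES[viewtype]
    return set(specific) | set(COMMON_TYPES)


print(sorted(node_types_for("wg")))
print(sorted(node_types_for("syntax")))
```

A lookup like this can help when checking that a query mixes only node types that belong to a single view.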
The data duplication not only impacts the representation of syntax trees, controlled by setting the view type, but it also impacts the creation of queries. Although the Text-Fabric dataset provides access to all nodes and features at all times, it is important to formulate syntactic queries that match either one of these data structures. The following figure provides two functionally equivalent queries:
Both queries examine instances where ‘fire’ is thrown down, focusing on the preposition used by specifying ‘prep’ instead of a lexeme. The queries search for clauses or word groups, respectively, that contain the verb βάλλω (‘to throw down’) and a complement phrase or adverbial word group with the lemma πῦρ (‘fire’). Both queries yield the same verses (Matthew 3:10; 7:19; Mark 9:22; Luke 3:9) and the same words, but return different tuple values. The differences arise from the query templates, which differ only in the first and third lines (‘clause’ or ‘phrase’ vs. ‘wg’), and this affects the first and third tuple elements. Note that a tuple in Python is an immutable, ordered collection of elements.
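As a rough illustration of what such a pair of templates could look like, the sketch below builds two search templates from the description above and checks which lines differ. It is a reconstruction, not a copy of the figure: the feature name `sp` for part of speech and the value `role=adv` for the adverbial word group are assumptions, and `differing_lines` is a hypothetical helper.

```python
# Template for the syntax-view (clause / phrase node types).
# Assumption: the part-of-speech feature is called `sp`.
SYNTAX_QUERY = """
clause
  word lemma=βάλλω
  phrase function=Cmpl
    word sp=prep
    word lemma=πῦρ
"""

# Template for the wg-view (wg node type only).
# Assumption: an adverbial word group is selected with `role=adv`.
WG_QUERY = """
wg
  word lemma=βάλλω
  wg role=adv
    word sp=prep
    word lemma=πῦρ
"""


def differing_lines(a, b):
    """Return the 1-based positions of the template lines that differ."""
    lines_a = a.strip().splitlines()
    lines_b = b.strip().splitlines()
    return [
        i
        for i, (x, y) in enumerate(zip(lines_a, lines_b), start=1)
        if x != y
    ]


print(differing_lines(SYNTAX_QUERY, WG_QUERY))  # [1, 3]
```

With the dataset loaded as `A`, either template could be passed to `A.search(...)`. Each result tuple then holds one node per template line, which is why only the first and third elements differ between the two result sets.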
Understanding the distinctions between these two views is especially important when building queries that involve parent-child relations, for example when using the edge features parent and sibling. See the following image for details:
This image compares the parent feature (arrows) and the sibling feature (connector with circle) for the first phrase of the book of John (John 1:1) in the wg-view and the syntax-view of the data. The parent of a specific node can be obtained using E.parent.f(node) and its siblings using E.sibling.b(node), where node stands for the number of the node. The direction of the arrow indicates the parent node of a given node. The dotted lines indicate that the wg nodes share the same data as the sentence, clause, and phrase nodes. The subphrase, verse, and chapter nodes are not nested in the calculation of the parent and sibling features.
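The `.f(node)` and `.b(node)` lookups follow Text-Fabric's general edge convention: `.f` follows edges that start at the node, `.b` follows edges that arrive at it. The sketch below mimics that convention with a toy class so the difference can be tried without loading the corpus. `ToyEdge` and all node numbers are hypothetical, and the direction in which the dataset stores its sibling edges is not stated above, so the example only demonstrates the forward/backward distinction for a child-to-parent edge.

```python
class ToyEdge:
    """Minimal stand-in for an edge feature (not the Text-Fabric class)."""

    def __init__(self, pairs):
        self._forward = {}   # source node -> list of target nodes
        self._backward = {}  # target node -> list of source nodes
        for source, target in pairs:
            self._forward.setdefault(source, []).append(target)
            self._backward.setdefault(target, []).append(source)

    def f(self, node):
        """Nodes reached by edges that start at `node`."""
        return tuple(self._forward.get(node, ()))

    def b(self, node):
        """Nodes from which an edge arrives at `node`."""
        return tuple(self._backward.get(node, ()))


# Hypothetical node numbers: words 10 and 11 both point to parent 20.
parent = ToyEdge([(10, 20), (11, 20)])

print(parent.f(10))  # (20,)     the parent of node 10
print(parent.b(20))  # (10, 11)  the nodes that have 20 as parent
```

In the real dataset the same pattern applies to `E.parent.f(node)`; running it on a wg node and on a phrase node covering the same words shows how the two views nest differently.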
wg-view | syntax-view |
---|---|
feature cls | feature typ |
adjp | AdjP |
advp | AdvP |
np | NP |
vp | VP |
feature typems | feature typ |
conjugated-wg | conjuncted |
apposition-group | apposition |
feature role | feature function |
io | Cmpl |
o | Objc |
o2 | Objc |
p | PreC |
s | Subj |
vc (for wg node) | Pred |
v (for word node) | Pred |
apposition | Appo |
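When translating a query from one view to the other, the role/function rows of the table can be used as a lookup. The sketch below encodes those rows and inverts them, which makes it visible that the mapping is not one-to-one. The names `ROLE_TO_FUNCTION` and `functions_to_roles` are hypothetical helpers, not part of the dataset.

```python
# wg-view `role` value -> syntax-view `function` value (rows of the table).
ROLE_TO_FUNCTION = {
    "io": "Cmpl",
    "o": "Objc",
    "o2": "Objc",
    "p": "PreC",
    "s": "Subj",
    "vc": "Pred",   # used on wg nodes
    "v": "Pred",    # used on word nodes
    "apposition": "Appo",
}


def functions_to_roles(mapping):
    """Invert the mapping: each function value lists all matching roles."""
    inverse = {}
    for role, function in mapping.items():
        inverse.setdefault(function, []).append(role)
    return inverse


inverse = functions_to_roles(ROLE_TO_FUNCTION)
print(inverse["Objc"])  # ['o', 'o2']
print(inverse["Pred"])  # ['vc', 'v']
```

A syntax-view condition such as `function=Objc` therefore corresponds to more than one `role` value in the wg-view, which is worth keeping in mind when the two queries are expected to return the same words.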
The viewtype concept is implemented by adding a small portion of Python code to the app's app.py file. The purpose of this file is to allow for the functional enhancements that are required to handle a corpus effectively. The viewtypes are defined by adding labels to various node types, as specified in the config.yaml file. After loading all corpus data and creating the API object, A.viewtype('syntax') is called to set the viewtype to 'syntax', making it the de facto default viewtype.
If for some reason it is necessary to display all nodes, the command A.viewtype('reset') can be issued. This also resets all node labels to their definitions in the config.yaml file.