Issues with importing textgrid #493
Replies: 6 comments
-
So you are referring to the

    [import.config.tier_groups]
    tok = ["pos", "lemma"]

This means that in the textgrid files the annotation tiers

Does this explain it? Do you have a minimal example file I could help you with?

Just a side note: Annatto uses graphANNIS internally, a graph database (and model) that replaces Salt. There are basically just nodes and edges. You can find more details (if you like) in the documentation.
-
Thank you very much for your answers. Yes, I'm referring to the tier_groups configuration.

After some tests, if I understand correctly, the declaration

produces nodes of type sDocumentStructure:SToken for each annotation in the tier

Furthermore, if I create other entries, for instance a morph entry:

It creates two independent timelines, each with its set of sDocumentStructure:SAudioRelation for linking the two sets of tokens (from tok and from morph) to the timeline, and two sTextualDS with their sets of sDocumentStructure:STextualRelation?

If I can ask some more questions:

Thank you very much for your help.
-
Correction:
No, it creates one STimeline (with the nodes from tok and morph anchored to it) and two sDocumentStructure:STextualDS (one for tok, the other for morph).
-
Salt is indeed discontinued (the importer and exporter for saltxml exist to support some use cases where the Salt representation was necessary, but it's basically like any other format now: just a format), and thus no STimeline, STextualRelation, STextualDS or any other SaltType is created. The only place where Salt is still used is in the ANNIS visualizers. The visualizers get their input as a SaltGraph, which is derived from a graphANNIS graph.

The latter is a different and more generic graph model. Nodes have a name and a type (usually

There are some ideas/concepts that originate in Salt and are sometimes mimicked in graphANNIS graphs. As for STokens, there is no such thing in graphANNIS. Nodes can be ordered via edges of component type

So from that a timeline in graphANNIS is usually a set of nodes of type

    [[import]]
    format = "textgrid"
    path = "..."
    [import.config.tier_groups]
    text = ["pos", "lemma"]
    morph = ["morph_pos"]

This will (roughly) have the following consequences:
If you do not want a tier to lose its own timestamps, you can easily make it a key in the tier group configuration and "hide" the dependencies between the tiers. You can basically do that with any tier; for the above configuration you could do something like this:

    [[import]]
    format = "textgrid"
    path = "..."
    [import.config.tier_groups]
    text = []
    pos = []
    lemma = []
    morph = []
    morph_pos = []

or, alternatively, not provide a tier grouping, so all timestamps are valid and all tiers are mapped into the graph independently of each other.

On the documentation in general: these things should be stated clearly there, I agree, but for now the graphANNIS documentation has to serve this purpose. We are planning to extend the documentation once certain standards are set and all importers create their graphs in the same fashion (just as an example, some importers create orderings like the above, others would use

Please let me know if you have any further questions.
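The second alternative mentioned above, not providing a tier grouping at all, could be sketched as follows. This is only an illustration that reuses the import step from the earlier snippets and drops the `[import.config.tier_groups]` table; the elided path is left as in the original, and whether the importer needs any other configuration keys is not covered here.

```toml
[[import]]
format = "textgrid"
path = "..."
# No [import.config.tier_groups] table here. Per the description above,
# every tier then keeps its own timestamps and is mapped into the graph
# independently of the other tiers.
```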
-
Thank you so much. The explanation you gave here could be included in the documentation of the textgrid importer.

When you say:

do you mean that the intervals in, say,

Thank you very much for the explanations about the new graphANNIS data model. I was indeed using the saltxml exporter to inspect the result of the conversion. Is there an exporter that produces a sort of official/straightforward serialisation of the graphANNIS data model?
-
For inspecting the graph I generally recommend using the

Regarding your question about which nodes are targeted, here comes the tricky part: the standard way of doing it is to always target timeline nodes, but there are some exceptions, or let's say there is generally the option to actually target the

About the timestamps, let's assume the following minimal example only containing a

Now imagine you set up the import tier groups as follows:

    text = ["sentence"]

As only

Now, when importing the
-
Originally posted by @sylvainloiseau in #447