sig_1
A selection of 1 document, created by jhermes
VM Text
ri-203928311
15dc523f-fd0a-42e8-8528-cb65e7c0e665
15dc523f-fd0a-42e8-8528-cb65e7c0e665
de.uni_koeln.spinfo.tesla.component.reader.VoynichInterlinearArchiveReader
de.uni_koeln.spinfo.tesla.roles.labeler.voynich.VoynichTextLabelsImpl
de.uni_koeln.spinfo.tesla.roles.labeler.voynich.VoynichTextLabelAccessAdapterImpl
de.uni_koeln.spinfo.tesla.annotation.adapter.hibernate.DefaultHibernateOutputAdapter
-203928311
Voynich Text Label Generator
General information about this role: Generates labels on Voynich text (section, Currier language, page identifier).
If true, filler tokens will be deleted.
true
false
If true, comment tokens will be deleted.
false
false
If true, content will be transcribed into the Currier alphabet; otherwise the EVA alphabet is used.
false
false
If true, all transcriptions will be added to the signal content. If false, only Takahashi's transcription will be added (unless another one is chosen).
false
false
Transcription of the VM to choose.
Takahashi
Currier
Friedman
Tiltman
Latham
Roe
Kluge
Reed
Currier2
Friedman2
Kluge2
Reed2
Latham
Takahashi
Landini
Stolfi
Grove
Petersen
Mardle
Zandbergen
false
Section of the VM to choose. See http://www.voynich.nu/descr.html .
All
All
Text
Herbal
Astronomical
Zodiac
Biological
Cosmological
Pharmaceutical
Stars
false
Currier Languages to choose. See http://www.voynich.nu/extra/curabcd.html .
Both
Both
CurrierA
CurrierB
false
If false, this component will be executed whenever used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
Jürgen Hermes
jhermes@spinfo.uni-koeln.de
Sprachliche Informationsverarbeitung
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
VM Reader
http://voynich.freie-literatur.de/index.php?show=extractor
java.lang.String
de.uni_koeln.spinfo.tesla.component.spre.SPre2Component
de.uni_koeln.spinfo.tesla.roles.core.impl.hibernate.data.Token
de.uni_koeln.spinfo.tesla.roles.tokenizer.impl.tunguska.access.TTokenizerAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter$ProtoStuff
2034831580
Tokenizer
General information about this role: Detects linguistic tokens.
de.uni_koeln.spinfo.tesla.roles.tokenizer.impl.hibernate.data.Sentence
de.uni_koeln.spinfo.tesla.roles.tokenizer.impl.tunguska.access.TSentenceTokenAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter$ProtoStuff
-166323541
Sentence Detector
General information about this role: Detects sentence boundaries.
Configurations for the SPre Character parser
<?xml version="1.0" encoding="UTF-8"?>
<spre:characterParser
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spre="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreCharacterParser"
xs:schemaLocation="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreCharacterParser SPreCharacterParser.xsd">
<spre:layer>CharacterLayer</spre:layer>
<spre:tokens>
<!-- ++++++++++++ -->
<!-- Letter items -->
<!-- ++++++++++++ -->
<spre:token name="a_min">a</spre:token>
<spre:token name="b_min">b</spre:token>
<spre:token name="c_min">c</spre:token>
<spre:token name="d_min">d</spre:token>
<spre:token name="e_min">e</spre:token>
<spre:token name="f_min">f</spre:token>
<spre:token name="g_min">g</spre:token>
<spre:token name="h_min">h</spre:token>
<spre:token name="i_min">i</spre:token>
<spre:token name="j_min">j</spre:token>
<spre:token name="k_min">k</spre:token>
<spre:token name="l_min">l</spre:token>
<spre:token name="m_min">m</spre:token>
<spre:token name="n_min">n</spre:token>
<spre:token name="o_min">o</spre:token>
<spre:token name="p_min">p</spre:token>
<spre:token name="q_min">q</spre:token>
<spre:token name="r_min">r</spre:token>
<spre:token name="s_min">s</spre:token>
<spre:token name="t_min">t</spre:token>
<spre:token name="u_min">u</spre:token>
<spre:token name="v_min">v</spre:token>
<spre:token name="w_min">w</spre:token>
<spre:token name="x_min">x</spre:token>
<spre:token name="y_min">y</spre:token>
<spre:token name="z_min">z</spre:token>
<spre:token name="a_maj">A</spre:token>
<spre:token name="b_maj">B</spre:token>
<spre:token name="c_maj">C</spre:token>
<spre:token name="d_maj">D</spre:token>
<spre:token name="e_maj">E</spre:token>
<spre:token name="f_maj">F</spre:token>
<spre:token name="g_maj">G</spre:token>
<spre:token name="h_maj">H</spre:token>
<spre:token name="i_maj">I</spre:token>
<spre:token name="j_maj">J</spre:token>
<spre:token name="k_maj">K</spre:token>
<spre:token name="l_maj">L</spre:token>
<spre:token name="m_maj">M</spre:token>
<spre:token name="n_maj">N</spre:token>
<spre:token name="o_maj">O</spre:token>
<spre:token name="p_maj">P</spre:token>
<spre:token name="q_maj">Q</spre:token>
<spre:token name="r_maj">R</spre:token>
<spre:token name="s_maj">S</spre:token>
<spre:token name="t_maj">T</spre:token>
<spre:token name="u_maj">U</spre:token>
<spre:token name="v_maj">V</spre:token>
<spre:token name="w_maj">W</spre:token>
<spre:token name="x_maj">X</spre:token>
<spre:token name="y_maj">Y</spre:token>
<spre:token name="z_maj">Z</spre:token>
<!-- +++++++++++ -->
<!-- Digit items -->
<!-- +++++++++++ -->
<spre:token name="Null">0</spre:token>
<spre:token name="One">1</spre:token>
<spre:token name="Two">2</spre:token>
<spre:token name="Three">3</spre:token>
<spre:token name="Four">4</spre:token>
<spre:token name="Five">5</spre:token>
<spre:token name="Six">6</spre:token>
<spre:token name="Seven">7</spre:token>
<spre:token name="Eight">8</spre:token>
<spre:token name="Nine">9</spre:token>
<!-- +++++++++++++++++ -->
<!-- Other items -->
<!-- +++++++++++++++++ -->
<spre:token name="Dot">.</spre:token>
<spre:token name="QuestionMark">?</spre:token>
<spre:token name="ExclamationMark">!</spre:token>
<spre:token name="Comma">,</spre:token>
<spre:token name="Colon">:</spre:token>
<spre:token name="SemiColon">;</spre:token>
<spre:token name="Hyphen">-</spre:token>
<spre:token name="BraceOpen">{</spre:token>
<spre:token name="BraceClose">}</spre:token>
<spre:token name="LowerThan">&lt;</spre:token>
<spre:token name="GreaterThan">&gt;</spre:token>
<spre:token name="Apostroph">'</spre:token>
<spre:token name="Quotation">"</spre:token>
<spre:token name="Backslash">\</spre:token>
<spre:token name="Percent">%</spre:token>
<!-- <spre:token name="Ampersand">&</spre:token>-->
<spre:token name="Equals">=</spre:token>
<spre:token name="Asterisk">*</spre:token>
<spre:token name="Plus">+</spre:token>
<!-- <spre:token name="Sharp">#</spre:token>-->
<!-- +++++++++++++++++ -->
<!-- Added items (not in SPre German)-->
<!-- +++++++++++++++++ -->
<spre:token name="Dollar">$</spre:token>
<!-- <spre:token name="Up">^</spre:token>
<spre:token name="Pipe">|</spre:token>
<spre:token name="Blinky">¤</spre:token>
<spre:token name="a_maj_aigu">À</spre:token>-->
</spre:tokens>
<spre:tokenClasses>
<spre:tokenClass name="Alphanumeric">
<spre:item>Letter</spre:item>
<spre:item>Digit</spre:item>
</spre:tokenClass>
<spre:tokenClass name="LowerCaseLetter">
<spre:item>a_min</spre:item>
<spre:item>b_min</spre:item>
<spre:item>c_min</spre:item>
<spre:item>d_min</spre:item>
<spre:item>e_min</spre:item>
<spre:item>f_min</spre:item>
<spre:item>g_min</spre:item>
<spre:item>h_min</spre:item>
<spre:item>i_min</spre:item>
<spre:item>j_min</spre:item>
<spre:item>k_min</spre:item>
<spre:item>l_min</spre:item>
<spre:item>m_min</spre:item>
<spre:item>n_min</spre:item>
<spre:item>o_min</spre:item>
<spre:item>p_min</spre:item>
<spre:item>q_min</spre:item>
<spre:item>r_min</spre:item>
<spre:item>s_min</spre:item>
<spre:item>t_min</spre:item>
<spre:item>u_min</spre:item>
<spre:item>v_min</spre:item>
<spre:item>w_min</spre:item>
<spre:item>x_min</spre:item>
<spre:item>y_min</spre:item>
<spre:item>z_min</spre:item>
</spre:tokenClass>
<spre:tokenClass name="CapitalLetter">
<spre:item>a_maj</spre:item>
<spre:item>b_maj</spre:item>
<spre:item>c_maj</spre:item>
<spre:item>d_maj</spre:item>
<spre:item>e_maj</spre:item>
<spre:item>f_maj</spre:item>
<spre:item>g_maj</spre:item>
<spre:item>h_maj</spre:item>
<spre:item>i_maj</spre:item>
<spre:item>j_maj</spre:item>
<spre:item>k_maj</spre:item>
<spre:item>l_maj</spre:item>
<spre:item>m_maj</spre:item>
<spre:item>n_maj</spre:item>
<spre:item>o_maj</spre:item>
<spre:item>p_maj</spre:item>
<spre:item>q_maj</spre:item>
<spre:item>r_maj</spre:item>
<spre:item>s_maj</spre:item>
<spre:item>t_maj</spre:item>
<spre:item>u_maj</spre:item>
<spre:item>v_maj</spre:item>
<spre:item>w_maj</spre:item>
<spre:item>x_maj</spre:item>
<spre:item>y_maj</spre:item>
<spre:item>z_maj</spre:item>
</spre:tokenClass>
<spre:tokenClass name="NotIdentifiedLetter">
<spre:item>Asterisk</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Digit">
<spre:item>Null</spre:item>
<spre:item>One</spre:item>
<spre:item>Two</spre:item>
<spre:item>Three</spre:item>
<spre:item>Four</spre:item>
<spre:item>Five</spre:item>
<spre:item>Six</spre:item>
<spre:item>Seven</spre:item>
<spre:item>Eight</spre:item>
<spre:item>Nine</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Letter">
<spre:item>LowerCaseLetter</spre:item>
<spre:item>CapitalLetter</spre:item>
<spre:item>NotIdentifiedLetter</spre:item>
</spre:tokenClass>
<!-- +++++++++++++++++ -->
<!-- Breaks-->
<!-- +++++++++++++++++ -->
<spre:tokenClass name="Breaks">
<spre:item>WordBreaks</spre:item>
<spre:item>ParagraphBreak</spre:item>
</spre:tokenClass>
<spre:tokenClass name="WordBreaks">
<spre:item>DefiniteWordBreak</spre:item>
<spre:item>DubiousWordBreak</spre:item>
<spre:item>LineBreak</spre:item>
</spre:tokenClass>
<spre:tokenClass name="DefiniteWordBreak">
<spre:item>Dot</spre:item>
</spre:tokenClass>
<spre:tokenClass name="DubiousWordBreak">
<spre:item>Comma</spre:item>
</spre:tokenClass>
<spre:tokenClass name="LineBreak">
<spre:item>Hyphen</spre:item>
</spre:tokenClass>
<spre:tokenClass name="ParagraphBreak">
<spre:item>Equals</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Comment">
<spre:item>CommentOpen</spre:item>
<spre:item>CommentClose</spre:item>
</spre:tokenClass>
<spre:tokenClass name="CommentOpen">
<spre:item>BraceOpen</spre:item>
</spre:tokenClass>
<spre:tokenClass name="CommentClose">
<spre:item>BraceClose</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Fillers">
<spre:item>ExclamationMark</spre:item>
<spre:item>Percent</spre:item>
</spre:tokenClass>
<!-- <spre:tokenClass name="Other">
<spre:item>QuestionMark</spre:item>
<spre:item>Colon</spre:item>
<spre:item>LowerThan</spre:item>
<spre:item>GreaterThan</spre:item>
<spre:item>Apostroph</spre:item>
<spre:item>Quotation</spre:item>
<spre:item>Backslash</spre:item>
<spre:item>Ampersand</spre:item>
<spre:item>Plus</spre:item>
<spre:item>Sharp</spre:item>
<spre:item>Dollar</spre:item>
<spre:item>Up</spre:item>
<spre:item>Pipe</spre:item>
<spre:item>Blinky</spre:item>
<spre:item>a_maj_aigu</spre:item>
</spre:tokenClass>-->
</spre:tokenClasses>
</spre:characterParser>
Character parser for VM texts (Value from Template "Voynich Character Parser" of Template Set "VM Preprocessor")
false
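The character-layer classes defined above can be illustrated with a small lookup. This is only a sketch in Python, not SPre's own implementation: the function names `classify` and `classify_line` are hypothetical, and the label `Unclassified` for characters that belong to no active token class is an assumption made for this example.

```python
import string

# Special characters and the token class they belong to in the
# character parser configuration (Dot, Comma, Hyphen, Equals, ...).
CHAR_CLASSES = {
    ".": "DefiniteWordBreak",
    ",": "DubiousWordBreak",
    "-": "LineBreak",
    "=": "ParagraphBreak",
    "{": "CommentOpen",
    "}": "CommentClose",
    "!": "Fillers",
    "%": "Fillers",
    "*": "NotIdentifiedLetter",
}


def classify(ch):
    """Return the most specific token class for a single character."""
    if ch in CHAR_CLASSES:
        return CHAR_CLASSES[ch]
    if ch in string.ascii_lowercase:
        return "LowerCaseLetter"
    if ch in string.ascii_uppercase:
        return "CapitalLetter"
    if ch in string.digits:
        return "Digit"
    # Assumption: anything else is reported under a placeholder label.
    return "Unclassified"


def classify_line(line):
    """Pair every character of a transcription line with its class."""
    return [(ch, classify(ch)) for ch in line]
```

For example, `classify_line("o.")` pairs `o` with `LowerCaseLetter` and `.` with `DefiniteWordBreak`.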
Configurations for the SPre parser based on the character parser
<?xml version="1.0" encoding="UTF-8"?>
<spre:defaultParser
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spre="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser"
xs:schemaLocation="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser SPreDefaultParser.xsd">
<!-- ************* -->
<!-- 1. The tokens -->
<!-- ************* -->
<spre:layer>WordLayer</spre:layer>
<spre:tokens>
<!-- The tokenClass Unprocessable will always be generated by the
characterParser. It's a bit problematic that the name is only fixed
on the level of the source code. -->
<spre:token name="UnprocessableTokenSequence">
<spre:pattern>
<spre:startsWith>Unprocessable</spre:startsWith>
<spre:contains>Unprocessable</spre:contains>
</spre:pattern>
</spre:token>
<spre:token name="UnprocessableToken">
<spre:pattern>
<spre:containsOnly>Unprocessable</spre:containsOnly>
</spre:pattern>
</spre:token>
<!-- ******************** -->
<!-- 1.1 The "Comment" token -->
<!-- ******************** -->
<spre:token name="Comment">
<spre:pattern>
<spre:startsWith>CommentOpen</spre:startsWith>
<spre:endsWith>CommentClose</spre:endsWith>
</spre:pattern>
</spre:token>
<!-- ******************** -->
<!-- 1.2 The "Word" token -->
<!-- ******************** -->
<spre:token name="Word">
<spre:pattern>
<spre:startsWith>Alphanumeric</spre:startsWith>
<spre:startsWith>Fillers</spre:startsWith>
<spre:contains>Alphanumeric</spre:contains>
<spre:contains>Fillers</spre:contains>
</spre:pattern>
</spre:token>
<!-- **************** -->
<!-- 1.3 Other tokens -->
<!-- **************** -->
<spre:token name="ParagraphEnd">
<spre:pattern>
<spre:containsOnly>ParagraphBreak</spre:containsOnly>
</spre:pattern>
</spre:token>
</spre:tokens>
<!-- ******************* -->
<!-- 2. The tokenClasses -->
<!-- ******************* -->
<spre:tokenClasses>
<spre:tokenClass name="Unprocessable">
<spre:item>UnprocessableToken</spre:item>
<spre:item>UnprocessableTokenSequence</spre:item>
</spre:tokenClass>
</spre:tokenClasses>
</spre:defaultParser>
Word parser for VM texts (Value from Template "Voynich Word Parser" of Template Set "VM Preprocessor")
false
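The word-layer patterns above (Comment, Word, ParagraphEnd) can be approximated with a regular expression. The following Python sketch is an illustration under stated assumptions, not the SPre matching engine: the name `tokenize` and the sample line are made up for this example, and break characters such as `.`, `,` and `-` are simply skipped rather than emitted.

```python
import re

# One alternative per word-layer token:
#   Comment      starts with "{" and ends with "}"
#   Word         a run of letters, digits, "*" and the fillers "!" and "%"
#   ParagraphEnd a run consisting only of "="
TOKEN_RE = re.compile(
    r"(?P<Comment>\{[^}]*\})"
    r"|(?P<Word>[A-Za-z0-9*!%]+)"
    r"|(?P<ParagraphEnd>=+)"
)


def tokenize(line):
    """Return (token name, text) pairs; unmatched characters act as breaks."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(line)]


# Hypothetical sample line in EVA-like notation.
sample = "daiin.ol{note}cth!res-chedy="
tokens = tokenize(sample)
```

Because fillers are part of the Word pattern, a sequence such as `cth!res` stays a single Word token unless filler deletion is enabled in the reader.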
Configurations for the SPre parser based on the secondary parser
<?xml version="1.0" encoding="UTF-8"?>
<spre:defaultParser
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spre="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser"
xs:schemaLocation="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser SPreDefaultParser.xsd">
<spre:layer>ParagraphLayer</spre:layer>
<!-- ************* -->
<!-- 1. The tokens -->
<!-- ************* -->
<spre:tokens>
<!-- The tokenClass Unprocessable will always be generated by the
characterParser. It's a bit problematic that the name is only fixed
on the level of the source code. -->
<spre:token name="UnprocessableTokenSequence">
<spre:pattern>
<spre:startsWith>Unprocessable</spre:startsWith>
<spre:contains>Unprocessable</spre:contains>
</spre:pattern>
</spre:token>
<spre:token name="UnprocessableToken">
<spre:pattern>
<spre:containsOnly>Unprocessable</spre:containsOnly>
</spre:pattern>
</spre:token>
<!-- ******************** -->
<!-- 1.1 The "Paragraph" token -->
<!-- ******************** -->
<spre:token name="Paragraph">
<spre:pattern>
<spre:startsWith>Word</spre:startsWith>
<spre:startsWith>Comment</spre:startsWith>
<spre:endsWith>ParagraphEnd</spre:endsWith>
</spre:pattern>
</spre:token>
</spre:tokens>
<spre:tokenClasses>
<spre:tokenClass name="Unprocessable">
<spre:item>UnprocessableToken</spre:item>
<spre:item>UnprocessableTokenSequence</spre:item>
</spre:tokenClass>
</spre:tokenClasses>
</spre:defaultParser>
Paragraph parser for VM texts (Value from Template "Voynich Paragraph Parser" of Template Set "VM Preprocessor")
false
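The Paragraph pattern above (starts with a Word or Comment, ends with a ParagraphEnd) amounts to grouping word-layer tokens up to each paragraph break. A minimal Python sketch follows; the function name `group_paragraphs` is hypothetical, and it assumes that every token between the start and the ParagraphEnd belongs to the paragraph, which the configuration does not state explicitly.

```python
def group_paragraphs(tokens):
    """Group (kind, text) pairs into paragraphs closed by a ParagraphEnd.

    Returns the list of complete paragraphs and the trailing tokens
    that were not closed by a ParagraphEnd.
    """
    paragraphs, current = [], []
    for kind, text in tokens:
        if kind == "ParagraphEnd" and not current:
            # A paragraph must start with a Word or Comment token.
            continue
        current.append((kind, text))
        if kind == "ParagraphEnd":
            paragraphs.append(current)
            current = []
    return paragraphs, current


# Hypothetical word-layer output used as input.
word_tokens = [
    ("Word", "daiin"), ("Word", "ol"), ("ParagraphEnd", "="),
    ("Comment", "{note}"), ("Word", "chedy"), ("ParagraphEnd", "="),
    ("Word", "qokeedy"),
]
paragraphs, rest = group_paragraphs(word_tokens)
```

Here `paragraphs` holds two complete paragraphs and `rest` holds the unterminated final word.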
If false, this component will be executed whenever used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
Jürgen Hermes
jhermes@spinfo.uni-koeln.de
Sprachliche Informationsverarbeitung
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
Christoph Benden
cbenden@spinfo.uni-koeln.de
Sprachliche Informationsverarbeitung
No external URL defined
A configurable layered tokenizer.
No external URL defined
de.uni_koeln.spinfo.formanalysis.teslacomponents.SimpleMorphemizerComponent
de.uni_koeln.spinfo.tesla.roles.core.impl.hibernate.data.Token
de.uni_koeln.spinfo.tesla.roles.tokenizer.impl.tunguska.access.TTokenizerAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter
2073918193
Morphemizer
General information about this role: Detects graphemes using successor/predecessor counts. Provides access to the morphemes of words.
-
Tokenizer
Detects linguistic tokens.
de.uni_koeln.spinfo.tesla.roles.tokenizer.Tokenizer
7e883193-7739-46c5-bcc8-e7b627cfde36
2034831580
de.uni_koeln.spinfo.tesla.roles.tokenizer.access.ITokenAccessAdapter
de.uni_koeln.spinfo.tesla.roles.tokenizer.data.IToken
Determines the minimal length of words that should be analysed
3
false
Determines the minimal score for morpheme boundaries
3
false
If true, words will be split into morphemes solely at the maximum evidence value
true
false
Count of re-analyses with detected morphemes treated as words
1
false
If true, accepted morpheme partitions must have a prefix in first, infix in mid and suffix in last position.
true
false
If true, increasing predecessor counts add to the score
true
false
If true, increasing successor counts add to the score
true
false
If true, local maximum predecessor counts add to the score
true
false
If true, maximum predecessor counts add to the score
true
false
If true, local maximum successor counts add to the score
true
false
If true, maximum successor counts add to the score
true
false
If true, local maximum combined (successor + predecessor) counts add to the score
true
false
If true, maximum combined (successor + predecessor) counts add to the score
true
false
If true, the word labels will be analysed instead of the signal content.
false
false
Determines the minimum number of occurrences a type must have to be analyzed
3
false
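The parameters above describe boundary scoring based on successor and predecessor counts. The Python sketch below illustrates the general idea on a toy English vocabulary. It is a simplification and not the SimpleMorphemizerComponent itself: all function names are hypothetical, only the three global-maximum criteria (successor, predecessor, combined) are scored, and each satisfied criterion is assumed to add one point.

```python
from collections import defaultdict


def successor_sets(words):
    """Map each proper prefix to the set of characters that follow it."""
    table = defaultdict(set)
    for w in words:
        for i in range(1, len(w)):
            table[w[:i]].add(w[i])
    return table


def predecessor_sets(words):
    """Map each proper suffix to the set of characters that precede it."""
    table = defaultdict(set)
    for w in words:
        for i in range(1, len(w)):
            table[w[i:]].add(w[i - 1])
    return table


def boundary_scores(word, words):
    """Score each inner position i (between word[:i] and word[i:])."""
    succ, pred = successor_sets(words), predecessor_sets(words)
    positions = range(1, len(word))
    s = [len(succ.get(word[:i], ())) for i in positions]
    p = [len(pred.get(word[i:], ())) for i in positions]
    c = [a + b for a, b in zip(s, p)]
    scores = {}
    for k, i in enumerate(positions):
        # One point per satisfied criterion (global maxima only).
        scores[i] = (s[k] == max(s)) + (p[k] == max(p)) + (c[k] == max(c))
    return scores


def split_word(word, words, min_score=3):
    """Cut the word at every position whose score reaches min_score."""
    parts, start = [], 0
    for i, score in boundary_scores(word, words).items():
        if score >= min_score:
            parts.append(word[start:i])
            start = i
    parts.append(word[start:])
    return parts


vocabulary = ["walked", "walking", "walks", "talked", "talking",
              "talks", "jumped", "jumping", "jumps"]
```

With this vocabulary, `split_word("walked", vocabulary)` cuts after `walk`, because that position has the highest successor count (three continuations: `e`, `i`, `s`) and also ties for the highest predecessor count.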
If false, this component will be executed whenever used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
Jürgen Hermes
jhermes@spinfo.uni-koeln.de
Sprachliche Informationsverarbeitung
none
Detects morphemes using successor and predecessor counts.
none
Morpheme analysis VM
Experiment Description
jhermes
hermesj@uni-koeln.de
none