sig_1
A selection of 1 document, created by jhermes
VM Text
ri101429920
8d1f6a83-15b1-4c82-8f83-6aa777dcb4ce
8d1f6a83-15b1-4c82-8f83-6aa777dcb4ce
de.uni_koeln.spinfo.tesla.component.reader.VoynichInterlinearArchiveReader
de.uni_koeln.spinfo.tesla.roles.labeler.voynich.VoynichTextLabelsImpl
de.uni_koeln.spinfo.tesla.roles.labeler.voynich.VoynichTextLabelAccessAdapterImpl
de.uni_koeln.spinfo.tesla.annotation.adapter.hibernate.DefaultHibernateOutputAdapter
101429920
Voynich Text Label Generator
General information about this role: Generates labels on the Voynich text (section, Currier language, page identifier).
If true, filler tokens will be deleted.
true
false
If true, comment tokens will be deleted.
false
false
If true, the content will be transcribed into the Currier alphabet; otherwise, the EVA alphabet is used.
false
false
If true, all transcriptions will be added to the signal content. If false, only Takahashi's transcription will be added (if no other one is chosen).
false
false
Transcription of the VM to choose.
Takahashi
Currier
Friedman
Tiltman
Latham
Roe
Kluge
Reed
Currier2
Friedman2
Kluge2
Reed2
Latham
Takahashi
Landini
Stolfi
Grove
Petersen
Mardle
Zandbergen
false
Section of the VM to choose. See http://www.voynich.nu/descr.html.
All
All
Text
Herbal
Astronomical
Zodiac
Biological
Cosmological
Pharmaceutical
Stars
false
Currier Languages to choose. See http://www.voynich.nu/extra/curabcd.html.
Both
Both
CurrierA
CurrierB
false
If false, this component will be executed whenever it is used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
true
false
Jürgen Hermes
jhermes@spinfo.uni-koeln.de
Sprachliche Informationsverarbeitung
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
VM Reader
http://voynich.freie-literatur.de/index.php?show=extractor
java.lang.String
de.uni_koeln.spinfo.tesla.component.spre.SPre2Component
de.uni_koeln.spinfo.tesla.roles.core.impl.hibernate.data.Token
de.uni_koeln.spinfo.tesla.roles.tokenizer.impl.tunguska.access.TTokenizerAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter$ProtoStuff
-1314117685
Tokenizer
General information about this role: Detects linguistic tokens.
de.uni_koeln.spinfo.tesla.roles.tokenizer.impl.hibernate.data.Sentence
de.uni_koeln.spinfo.tesla.roles.tokenizer.impl.tunguska.access.TSentenceTokenAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter$ProtoStuff
-898040694
Sentence Detector
General information about this role: Detects sentence boundaries.
Configurations for the SPre Character parser
<?xml version="1.0" encoding="UTF-8"?>
<spre:characterParser
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spre="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreCharacterParser"
xs:schemaLocation="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreCharacterParser SPreCharacterParser.xsd">
<spre:layer>CharacterLayer</spre:layer>
<spre:tokens>
<!-- ++++++++++++ -->
<!-- Letter items -->
<!-- ++++++++++++ -->
<spre:token name="a_min">a</spre:token>
<spre:token name="b_min">b</spre:token>
<spre:token name="c_min">c</spre:token>
<spre:token name="d_min">d</spre:token>
<spre:token name="e_min">e</spre:token>
<spre:token name="f_min">f</spre:token>
<spre:token name="g_min">g</spre:token>
<spre:token name="h_min">h</spre:token>
<spre:token name="i_min">i</spre:token>
<spre:token name="j_min">j</spre:token>
<spre:token name="k_min">k</spre:token>
<spre:token name="l_min">l</spre:token>
<spre:token name="m_min">m</spre:token>
<spre:token name="n_min">n</spre:token>
<spre:token name="o_min">o</spre:token>
<spre:token name="p_min">p</spre:token>
<spre:token name="q_min">q</spre:token>
<spre:token name="r_min">r</spre:token>
<spre:token name="s_min">s</spre:token>
<spre:token name="t_min">t</spre:token>
<spre:token name="u_min">u</spre:token>
<spre:token name="v_min">v</spre:token>
<spre:token name="w_min">w</spre:token>
<spre:token name="x_min">x</spre:token>
<spre:token name="y_min">y</spre:token>
<spre:token name="z_min">z</spre:token>
<spre:token name="a_maj">A</spre:token>
<spre:token name="b_maj">B</spre:token>
<spre:token name="c_maj">C</spre:token>
<spre:token name="d_maj">D</spre:token>
<spre:token name="e_maj">E</spre:token>
<spre:token name="f_maj">F</spre:token>
<spre:token name="g_maj">G</spre:token>
<spre:token name="h_maj">H</spre:token>
<spre:token name="i_maj">I</spre:token>
<spre:token name="j_maj">J</spre:token>
<spre:token name="k_maj">K</spre:token>
<spre:token name="l_maj">L</spre:token>
<spre:token name="m_maj">M</spre:token>
<spre:token name="n_maj">N</spre:token>
<spre:token name="o_maj">O</spre:token>
<spre:token name="p_maj">P</spre:token>
<spre:token name="q_maj">Q</spre:token>
<spre:token name="r_maj">R</spre:token>
<spre:token name="s_maj">S</spre:token>
<spre:token name="t_maj">T</spre:token>
<spre:token name="u_maj">U</spre:token>
<spre:token name="v_maj">V</spre:token>
<spre:token name="w_maj">W</spre:token>
<spre:token name="x_maj">X</spre:token>
<spre:token name="y_maj">Y</spre:token>
<spre:token name="z_maj">Z</spre:token>
<!-- +++++++++++ -->
<!-- Digit items -->
<!-- +++++++++++ -->
<spre:token name="Null">0</spre:token>
<spre:token name="One">1</spre:token>
<spre:token name="Two">2</spre:token>
<spre:token name="Three">3</spre:token>
<spre:token name="Four">4</spre:token>
<spre:token name="Five">5</spre:token>
<spre:token name="Six">6</spre:token>
<spre:token name="Seven">7</spre:token>
<spre:token name="Eight">8</spre:token>
<spre:token name="Nine">9</spre:token>
<!-- +++++++++++++++++ -->
<!-- Other items -->
<!-- +++++++++++++++++ -->
<spre:token name="Dot">.</spre:token>
<spre:token name="QuestionMark">?</spre:token>
<spre:token name="ExclamationMark">!</spre:token>
<spre:token name="Comma">,</spre:token>
<spre:token name="Colon">:</spre:token>
<spre:token name="SemiColon">;</spre:token>
<spre:token name="Hyphen">-</spre:token>
<spre:token name="BraceOpen">{</spre:token>
<spre:token name="BraceClose">}</spre:token>
<spre:token name="LowerThan"><</spre:token>
<spre:token name="GreaterThan">></spre:token>
<spre:token name="Apostroph">'</spre:token>
<spre:token name="Quotation">"</spre:token>
<spre:token name="Backslash">\</spre:token>
<spre:token name="Percent">%</spre:token>
<!-- <spre:token name="Ampersand">&</spre:token>-->
<spre:token name="Equals">=</spre:token>
<spre:token name="Asterisk">*</spre:token>
<spre:token name="Plus">+</spre:token>
<!-- <spre:token name="Sharp">#</spre:token>-->
<!-- +++++++++++++++++ -->
<!-- Added items (not in SPre German)-->
<!-- +++++++++++++++++ -->
<spre:token name="Dollar">$</spre:token>
<!-- <spre:token name="Up">^</spre:token>
<spre:token name="Pipe">|</spre:token>
<spre:token name="Blinky">¤</spre:token>
<spre:token name="a_maj_aigu">À</spre:token>-->
</spre:tokens>
<spre:tokenClasses>
<spre:tokenClass name="Alphanumeric">
<spre:item>Letter</spre:item>
<spre:item>Digit</spre:item>
</spre:tokenClass>
<spre:tokenClass name="LowerCaseLetter">
<spre:item>a_min</spre:item>
<spre:item>b_min</spre:item>
<spre:item>c_min</spre:item>
<spre:item>d_min</spre:item>
<spre:item>e_min</spre:item>
<spre:item>f_min</spre:item>
<spre:item>g_min</spre:item>
<spre:item>h_min</spre:item>
<spre:item>i_min</spre:item>
<spre:item>j_min</spre:item>
<spre:item>k_min</spre:item>
<spre:item>l_min</spre:item>
<spre:item>m_min</spre:item>
<spre:item>n_min</spre:item>
<spre:item>o_min</spre:item>
<spre:item>p_min</spre:item>
<spre:item>q_min</spre:item>
<spre:item>r_min</spre:item>
<spre:item>s_min</spre:item>
<spre:item>t_min</spre:item>
<spre:item>u_min</spre:item>
<spre:item>v_min</spre:item>
<spre:item>w_min</spre:item>
<spre:item>x_min</spre:item>
<spre:item>y_min</spre:item>
<spre:item>z_min</spre:item>
</spre:tokenClass>
<spre:tokenClass name="CapitalLetter">
<spre:item>a_maj</spre:item>
<spre:item>b_maj</spre:item>
<spre:item>c_maj</spre:item>
<spre:item>d_maj</spre:item>
<spre:item>e_maj</spre:item>
<spre:item>f_maj</spre:item>
<spre:item>g_maj</spre:item>
<spre:item>h_maj</spre:item>
<spre:item>i_maj</spre:item>
<spre:item>j_maj</spre:item>
<spre:item>k_maj</spre:item>
<spre:item>l_maj</spre:item>
<spre:item>m_maj</spre:item>
<spre:item>n_maj</spre:item>
<spre:item>o_maj</spre:item>
<spre:item>p_maj</spre:item>
<spre:item>q_maj</spre:item>
<spre:item>r_maj</spre:item>
<spre:item>s_maj</spre:item>
<spre:item>t_maj</spre:item>
<spre:item>u_maj</spre:item>
<spre:item>v_maj</spre:item>
<spre:item>w_maj</spre:item>
<spre:item>x_maj</spre:item>
<spre:item>y_maj</spre:item>
<spre:item>z_maj</spre:item>
</spre:tokenClass>
<spre:tokenClass name="NotIdentifiedLetter">
<spre:item>Asterisk</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Digit">
<spre:item>Null</spre:item>
<spre:item>One</spre:item>
<spre:item>Two</spre:item>
<spre:item>Three</spre:item>
<spre:item>Four</spre:item>
<spre:item>Five</spre:item>
<spre:item>Six</spre:item>
<spre:item>Seven</spre:item>
<spre:item>Eight</spre:item>
<spre:item>Nine</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Letter">
<spre:item>LowerCaseLetter</spre:item>
<spre:item>CapitalLetter</spre:item>
<spre:item>NotIdentifiedLetter</spre:item>
</spre:tokenClass>
<!-- +++++++++++++++++ -->
<!-- Breaks-->
<!-- +++++++++++++++++ -->
<spre:tokenClass name="Breaks">
<spre:item>WordBreaks</spre:item>
<spre:item>ParagraphBreak</spre:item>
</spre:tokenClass>
<spre:tokenClass name="WordBreaks">
<spre:item>DefiniteWordBreak</spre:item>
<spre:item>DubiousWordBreak</spre:item>
<spre:item>LineBreak</spre:item>
</spre:tokenClass>
<spre:tokenClass name="DefiniteWordBreak">
<spre:item>Dot</spre:item>
</spre:tokenClass>
<spre:tokenClass name="DubiousWordBreak">
<spre:item>Comma</spre:item>
</spre:tokenClass>
<spre:tokenClass name="LineBreak">
<spre:item>Hyphen</spre:item>
</spre:tokenClass>
<spre:tokenClass name="ParagraphBreak">
<spre:item>Equals</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Comment">
<spre:item>CommentOpen</spre:item>
<spre:item>CommentClose</spre:item>
</spre:tokenClass>
<spre:tokenClass name="CommentOpen">
<spre:item>BraceOpen</spre:item>
</spre:tokenClass>
<spre:tokenClass name="CommentClose">
<spre:item>BraceClose</spre:item>
</spre:tokenClass>
<spre:tokenClass name="Fillers">
<spre:item>ExclamationMark</spre:item>
<spre:item>Percent</spre:item>
</spre:tokenClass>
<!-- <spre:tokenClass name="Other">
<spre:item>QuestionMark</spre:item>
<spre:item>Colon</spre:item>
<spre:item>LowerThan</spre:item>
<spre:item>GreaterThan</spre:item>
<spre:item>Apostroph</spre:item>
<spre:item>Quotation</spre:item>
<spre:item>Backslash</spre:item>
<spre:item>Ampersand</spre:item>
<spre:item>Plus</spre:item>
<spre:item>Sharp</spre:item>
<spre:item>Dollar</spre:item>
<spre:item>Up</spre:item>
<spre:item>Pipe</spre:item>
<spre:item>Blinky</spre:item>
<spre:item>a_maj_aigu</spre:item>
</spre:tokenClass>-->
</spre:tokenClasses>
</spre:characterParser>
Character parser for VM texts (Value from Template "Voynich Character Parser" of Template Set "VM Preprocessor")
false
Configurations for the SPre parser based on the character parser
<?xml version="1.0" encoding="UTF-8"?>
<spre:defaultParser
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spre="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser"
xs:schemaLocation="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser SPreDefaultParser.xsd">
<!-- ************* -->
<!-- 1. The tokens -->
<!-- ************* -->
<spre:layer>WordLayer</spre:layer>
<spre:tokens>
<!-- The tokenClass Unprocessable is always generated by the
characterParser. It is a bit problematic that its name is fixed
only at the level of the source code. -->
<spre:token name="UnprocessableTokenSequence">
<spre:pattern>
<spre:startsWith>Unprocessable</spre:startsWith>
<spre:contains>Unprocessable</spre:contains>
</spre:pattern>
</spre:token>
<spre:token name="UnprocessableToken">
<spre:pattern>
<spre:containsOnly>Unprocessable</spre:containsOnly>
</spre:pattern>
</spre:token>
<!-- ******************** -->
<!-- 1.1 The "Comment" token -->
<!-- ******************** -->
<spre:token name="Comment">
<spre:pattern>
<spre:startsWith>CommentOpen</spre:startsWith>
<spre:endsWith>CommentClose</spre:endsWith>
</spre:pattern>
</spre:token>
<!-- ******************** -->
<!-- 1.2 The "Word" token -->
<!-- ******************** -->
<spre:token name="Word">
<spre:pattern>
<spre:startsWith>Alphanumeric</spre:startsWith>
<spre:startsWith>Fillers</spre:startsWith>
<spre:contains>Alphanumeric</spre:contains>
<spre:contains>Fillers</spre:contains>
</spre:pattern>
</spre:token>
<!-- **************** -->
<!-- 1.3 Other tokens -->
<!-- **************** -->
<spre:token name="ParagraphEnd">
<spre:pattern>
<spre:containsOnly>ParagraphBreak</spre:containsOnly>
</spre:pattern>
</spre:token>
</spre:tokens>
<!-- ******************* -->
<!-- 2. The tokenClasses -->
<!-- ******************* -->
<spre:tokenClasses>
<spre:tokenClass name="Unprocessable">
<spre:item>UnprocessableToken</spre:item>
<spre:item>UnprocessableTokenSequence</spre:item>
</spre:tokenClass>
</spre:tokenClasses>
</spre:defaultParser>
Word parser for VM texts (Value from Template "Voynich Word Parser" of Template Set "VM Preprocessor")
false
Configurations for the SPre parser based on the secondary parser
<?xml version="1.0" encoding="UTF-8"?>
<spre:defaultParser
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spre="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser"
xs:schemaLocation="http://spinfo.uni_koeln.de/spre/xmlSchema/SPreDefaultParser SPreDefaultParser.xsd">
<spre:layer>ParagraphLayer</spre:layer>
<!-- ************* -->
<!-- 1. The tokens -->
<!-- ************* -->
<spre:tokens>
<!-- The tokenClass Unprocessable is always generated by the
characterParser. It is a bit problematic that its name is fixed
only at the level of the source code. -->
<spre:token name="UnprocessableTokenSequence">
<spre:pattern>
<spre:startsWith>Unprocessable</spre:startsWith>
<spre:contains>Unprocessable</spre:contains>
</spre:pattern>
</spre:token>
<spre:token name="UnprocessableToken">
<spre:pattern>
<spre:containsOnly>Unprocessable</spre:containsOnly>
</spre:pattern>
</spre:token>
<!-- ******************** -->
<!-- 1.1 The "Paragraph" token -->
<!-- ******************** -->
<spre:token name="Paragraph">
<spre:pattern>
<spre:startsWith>Word</spre:startsWith>
<spre:startsWith>Comment</spre:startsWith>
<spre:endsWith>ParagraphEnd</spre:endsWith>
</spre:pattern>
</spre:token>
</spre:tokens>
<spre:tokenClasses>
<spre:tokenClass name="Unprocessable">
<spre:item>UnprocessableToken</spre:item>
<spre:item>UnprocessableTokenSequence</spre:item>
</spre:tokenClass>
</spre:tokenClasses>
</spre:defaultParser>
Paragraph parser for VM texts (Value from Template "Voynich Paragraph Parser" of Template Set "VM Preprocessor")
false
If false, this component will be executed whenever it is used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
Jürgen Hermes
jhermes@spinfo.uni-koeln.de
Sprachliche Informationsverarbeitung
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
Christoph Benden
cbenden@spinfo.uni-koeln.de
Sprachliche Informationsverarbeitung
No external URL defined
A configurable layered tokenizer.
No external URL defined
corpusstatistics.CoincidenceStatisticsComponent
de.uni_koeln.spinfo.tesla.roles.labeler.corpusstats.CoincidenceStatsImpl
de.uni_koeln.spinfo.tesla.roles.labeler.corpusstats.CoincidenceStatsAccessAdapterImpl
de.uni_koeln.spinfo.tesla.annotation.adapter.hibernate.DefaultHibernateOutputAdapter
-1553213099
Coincidence Statistics Calculator
General information about this role: Calculates intra- and inter-signal coincidence statistics (kappa and chi)
-
Tokenizer
Detects linguistic tokens.
de.uni_koeln.spinfo.tesla.roles.tokenizer.Tokenizer
b1396d41-46ba-46c0-97f3-4483f8acfe1d
-1314117685
de.uni_koeln.spinfo.tesla.roles.tokenizer.access.ITokenAccessAdapter
de.uni_koeln.spinfo.tesla.roles.tokenizer.data.IToken
Select if you want to calculate the CI values for the whole selection, for each document or for document pairs.
Document Values
Selection Values
Document Values
Document Pair Values
false
If true, the statistics will be calculated on the labels of each token; otherwise, on the signal contents.
false
false
If true, statistics will be calculated per document; if false, per document selection.
true
false
If true, all upper-case letters of the signals will be replaced by their lower-case counterparts.
true
false
If false, this component will be executed whenever it is used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
jhermes
jhermes@spinfo.uni-koeln.de
uni-koeln.ifl.spinfo
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
Calculates coincidence values of texts - kappa, chi, psi, and phi
No external URL defined
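The "kappa" this component reports presumably corresponds to the classic index of coincidence from Friedman-style cryptanalysis, which is widely used in Voynich research. As a hedged illustration only (the component's exact formula is not documented here), a minimal sketch of that measure:

```python
from collections import Counter

def index_of_coincidence(text):
    # Probability that two symbols drawn (without replacement) from two
    # random positions of the text are identical.
    counts = Counter(text)
    n = len(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
```

For uniformly distributed symbols the value approaches 1/alphabet size; natural-language text scores noticeably higher, which is one reason coincidence statistics are applied to the VM.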
corpusstatistics.CorpusStatisticsComponent
de.uni_koeln.spinfo.tesla.roles.labeler.corpusstats.CorpusStatsImpl
de.uni_koeln.spinfo.tesla.roles.labeler.corpusstats.CorpusStatsAccessAdapterImpl
de.uni_koeln.spinfo.tesla.annotation.adapter.hibernate.DefaultHibernateOutputAdapter
127959079
Corpus Statistics Calculator
General information about this role: Calculates diverse corpus statistics (Zipf and entropy values, word length distribution, type-token frequency, etc.).
de.uni_koeln.spinfo.tesla.roles.vectorengine.data.impl.hibernate.IntegerArrayVector
de.uni_koeln.spinfo.tesla.roles.vectorengine.access.impl.tunguska.IntegerVectorAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter
-820044043
Integer Vector Generator
General information about this role: Generates a vector representation of the processed data in which each vector consists of integers.
de.uni_koeln.spinfo.tesla.roles.vectorengine.data.impl.hibernate.IntegerArrayVector
de.uni_koeln.spinfo.tesla.roles.vectorengine.access.impl.tunguska.IntegerVectorAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter
1191324021
Integer Vector Generator
General information about this role: Generates a vector representation of the processed data in which each vector consists of integers.
de.uni_koeln.spinfo.tesla.roles.vectorengine.data.impl.hibernate.DoubleArrayVector
de.uni_koeln.spinfo.tesla.roles.vectorengine.access.impl.tunguska.DoubleVectorAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter
-637595554
Double Vector Generator
General information about this role: Generates a vector representation of the processed data in which each vector consists of floating point numbers.
-
Tokenizer
Detects linguistic tokens.
de.uni_koeln.spinfo.tesla.roles.tokenizer.Tokenizer
b1396d41-46ba-46c0-97f3-4483f8acfe1d
-1314117685
de.uni_koeln.spinfo.tesla.roles.tokenizer.access.ITokenAccessAdapter
de.uni_koeln.spinfo.tesla.roles.tokenizer.data.IToken
Method to calculate the type-token relation. Choose between standardised calculation (normalised by the number of tokens per cohort) and position calculation (usual or Koehler-Galle style).
Standardised-1K
Position-Usual
Position-Koehler-Galle
Standardised-1K
Standardised-10K
Standardised-100K
Standardised-1M
false
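The standardised variants above average the type-token ratio over fixed-size cohorts. A minimal sketch of that idea, assuming the Standardised-1K reading (the component's exact cohort handling, e.g. of a trailing partial cohort, may differ):

```python
def standardised_ttr(tokens, cohort_size=1000):
    # Split the token stream into consecutive cohorts of cohort_size
    # tokens, compute types/tokens per cohort, and average the ratios.
    ratios = []
    for start in range(0, len(tokens) - cohort_size + 1, cohort_size):
        cohort = tokens[start:start + cohort_size]
        ratios.append(len(set(cohort)) / cohort_size)
    return sum(ratios) / len(ratios) if ratios else 0.0
```

Standardising per cohort avoids the well-known problem that a raw type-token ratio shrinks as the text grows, making documents of different lengths comparable.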
Please choose between true and false. The default value is true.
true
false
Please choose between true and false. The default value is true.
true
false
Please choose between true and false. The default value is true.
true
false
Please choose between true and false. The default value is true.
true
false
Please choose between true and false. The default value is true.
true
false
Please choose between true and false. The default value is true.
true
false
Please choose between true and false. The default value is true.
true
false
Please choose between true and false. The default value is true.
true
false
If true, the statistics will be calculated on the labels of each token; otherwise, on the signal contents.
false
false
If true, statistics will be calculated per document; if false, per document selection.
true
false
If true, all upper-case letters of the signals will be replaced by their lower-case counterparts.
true
false
If false, this component will be executed whenever it is used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
Jürgen Hermes
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
CorpusStatistics, a Tesla (http://www.spinfo.uni-koeln.de/space/Forschung/Tesla) natural language processing component. Note: The calculation of the entropy values is very memory intensive. If you want to calculate document statistics, the documents should not be too large; if you want to calculate overall statistics, your corpus should not be too large.
No external URL defined
corpusstatistics.RandomWalkComponent
de.uni_koeln.spinfo.tesla.roles.vectorengine.data.impl.hibernate.DoubleArrayVector
de.uni_koeln.spinfo.tesla.roles.vectorengine.access.impl.tunguska.DoubleVectorAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter
507369493
Double Vector Generator
General information about this role: Generates a vector representation of the processed data in which each vector consists of floating point numbers.
de.uni_koeln.spinfo.tesla.roles.vectorengine.data.impl.hibernate.DoubleArrayVector
de.uni_koeln.spinfo.tesla.roles.vectorengine.access.impl.tunguska.DoubleVectorAccessAdapter
de.uni_koeln.spinfo.tesla.annotation.adapter.tunguska.DefaultTunguskaOutputAdapter
92914913
Double Vector Generator
General information about this role: Generates a vector representation of the processed data in which each vector consists of floating point numbers.
-
Tokenizer
Detects linguistic tokens.
de.uni_koeln.spinfo.tesla.roles.tokenizer.Tokenizer
b1396d41-46ba-46c0-97f3-4483f8acfe1d
-1314117685
de.uni_koeln.spinfo.tesla.roles.tokenizer.access.ITokenAccessAdapter
de.uni_koeln.spinfo.tesla.roles.tokenizer.data.IToken
Alphabet of characters that should be converted to bitsets; order is not significant.
abcdefghijklmnopqrstuvwxyz
false
Upper bound of interval calculations relative to the walk length (e.g., for a value of 10, the longest interval calculated has size WalkLength/10).
1000
false
If true, the statistics will be calculated on the labels of each token; otherwise, on the signal contents.
false
false
If true, statistics will be calculated per document; if false, per document selection.
true
false
If true, all upper-case letters of the signals will be replaced by their lower-case counterparts.
true
false
If false, this component will be executed whenever it is used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
jhermes
jhermes@spinfo.uni-koeln.de
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
Performs random walks and calculates long-range correlations.
No external URL defined
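This component converts characters to bitsets before walking; that mapping is not documented here, so as a simplified illustration only, here is the classic one-dimensional random-walk profile over a binary partition of the alphabet (characters in one class step +1, all others -1):

```python
from itertools import accumulate

def random_walk(text, up_chars):
    # Map each character to +1 (member of up_chars) or -1, then take
    # the cumulative sum: the resulting profile is the random walk.
    steps = [1 if c in up_chars else -1 for c in text]
    return list(accumulate(steps))
```

Long-range correlations are then typically estimated from how the fluctuation of such a profile scales with the interval (window) length, which is where the interval upper-bound parameter above comes in.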
corpusstatistics.RepeatedWordsDetector
de.uni_koeln.spinfo.tesla.roles.labeler.corpusstats.MultipleTokensStats
de.uni_koeln.spinfo.tesla.roles.labeler.corpusstats.MultipleTokensStatsAccessAdapterImpl
de.uni_koeln.spinfo.tesla.annotation.adapter.hibernate.DefaultHibernateOutputAdapter
-1308380994
Consecutive Multiples Detector
General information about this role: Detects occurrences where identical or very similar words (Levenshtein distance of at most 1) directly follow each other.
-
Tokenizer
Detects linguistic tokens.
de.uni_koeln.spinfo.tesla.roles.tokenizer.Tokenizer
b1396d41-46ba-46c0-97f3-4483f8acfe1d
-1314117685
de.uni_koeln.spinfo.tesla.roles.tokenizer.access.ITokenAccessAdapter
de.uni_koeln.spinfo.tesla.roles.tokenizer.data.IToken
If true, the statistics will be calculated on the labels of each token; otherwise, on the signal contents.
false
false
If true, statistics will be calculated per document; if false, per document selection.
true
false
If true, all upper-case letters of the signals will be replaced by their lower-case counterparts.
true
false
If false, this component will be executed whenever it is used in an experiment. If true, the annotations produced by this component earlier will be reused if the execution prerequisites did not change.
false
false
jhermes
jhermes@spinfo.uni-koeln.de
uni-koeln.ifl.spinfo
http://www.phil-fak.uni-koeln.de/spinfo-juergenhermes.html
Detects occurrences of repeated words
No external URL defined
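The detection described above can be sketched as a pass over adjacent token pairs with a standard Levenshtein distance; this is an assumption-laden illustration of the technique (function name and threshold handling are hypothetical, not the component's actual API):

```python
def consecutive_multiples(tokens, max_distance=1):
    # Return the indices i where tokens[i] and tokens[i+1] are identical
    # or within max_distance edit operations of each other.
    def lev(a, b):
        # Standard dynamic-programming Levenshtein distance.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,          # deletion
                               cur[j - 1] + 1,      # insertion
                               prev[j - 1] + (ca != cb)))  # substitution
            prev = cur
        return prev[-1]
    return [i for i in range(len(tokens) - 1)
            if lev(tokens[i], tokens[i + 1]) <= max_distance]
```

Such exact or near repetitions of consecutive words are strikingly frequent in the VM text compared to natural-language corpora, which is why this detector is part of the statistics suite.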
VM Text Statistics
Calculation of basic statistics of the VM text.
jhermes
hermesj@uni-koeln.de
none