=======================================Start Legend====================================================
All files are encoded in ANSI, except the Russian-language files, which are encoded in UTF-8
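
The encoding note above can be applied when loading the files roughly as follows. This is a minimal sketch, not part of the original data package. It assumes that "ANSI" means Windows-1252 (cp1252), that the Russian-language files are the ones whose names start with "RU", and that the fields are comma-separated; the legend states none of these, so adjust as needed. The helper names guess_encoding and read_rows are hypothetical.

```python
import csv


def guess_encoding(filename: str) -> str:
    """Pick a codec for a data file based on the legend's encoding note."""
    # Assumption: Russian-language files are those named "RU...".
    # "utf-8-sig" decodes UTF-8 with or without a leading byte order mark.
    if filename.startswith("RU"):
        return "utf-8-sig"
    # Assumption: "ANSI" refers to the Windows-1252 code page.
    return "cp1252"


def read_rows(filename: str, delimiter: str = ","):
    """Read one data file into a list of dicts keyed by the header row."""
    # Assumption: comma delimiter; pass delimiter="\t" or ";" if the files differ.
    with open(filename, newline="", encoding=guess_encoding(filename)) as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))
```

If a file fails to decode or shows garbled characters, the ANSI assumption is the first thing to revisit (for example, cp1251 instead of cp1252).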

Files: 
SpTimesPartsOfSpeech.csv	English texts, St. Petersburg Times 
SpTimesCardinalsTimeDays.csv	English texts, St. Petersburg Times

Variables Legend:

term -- word analyzed
corpus -- collection name
TC -- term count
DF -- document frequency
k -- NB parameter k
p -- NB parameter p
POS -- part of speech description; see article for legend
semantics -- semantic description: W - days of the week; T - time expressions; CRD - cardinals

=======================================================================================================

Files:

ENGmodelsKpermutedWords.csv	English texts, Rgveda - parameter k values, permuted words
ENGmodelsPpermutedWords.csv	English texts, Rgveda - parameter p values, permuted words	
ENGsmodelsKpermutedSentences.csv	English texts, Rgveda - parameter k values, permuted sentences
ENGsmodelsPpermutedSentences.csv	English texts, Rgveda - parameter p values, permuted sentences
RUmodelsKpermutedWords.csv	Russian texts, Rgveda - parameter k values, permuted words
RUmodelsPpermutedWords.csv	Russian texts, Rgveda - parameter p values, permuted words
RUsmodelsKpermutedSentences.csv	Russian texts, Rgveda - parameter k values, permuted sentences
RUsmodelsPpermutedSentences.csv	Russian texts, Rgveda - parameter p values, permuted sentences
SKmodelsKpermutedWords.csv	Sanskrit texts, Rgveda - parameter k values, permuted words
SKmodelsPpermutedWords.csv	Sanskrit texts, Rgveda - parameter p values, permuted words
SKsmodelsKpermutedSentences.csv	Sanskrit texts, Rgveda - parameter k values, permuted sentences
SKsmodelsPpermutedSentences.csv	Sanskrit texts, Rgveda - parameter p values, permuted sentences

Variables Legend:

word -- word analyzed
corpus -- collection name
k* -- NB parameter k: k0 - real data; kn - randomized data, where n is the number of the randomization run
p* -- NB parameter p: p0 - real data; pn - randomized data, where n is the number of the randomization run
cat -- part of speech description: i - proper names; m - pronouns; f - prepositions/prefixes
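
The k*/p* naming scheme in the legend above can be used to separate the real-data column from the randomized runs. The sketch below is illustrative only. It assumes the column headers are literally "k0", "k1", ... (or "p0", "p1", ...), which the legend implies but does not state outright; the function name split_runs is hypothetical.

```python
import re


def split_runs(columns, prefix):
    """Split header names into (real-data column, randomized-run columns).

    `prefix` is "k" or "p". Columns that do not look like prefix + integer
    (for example "word", "corpus", "cat") are ignored.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    numbered = {}
    for name in columns:
        match = pattern.match(name)
        if match:
            numbered[int(match.group(1))] = name
    # Run 0 is the real data; every other number is a randomization run.
    real = numbered.pop(0, None)
    randomized = [numbered[n] for n in sorted(numbered)]
    return real, randomized


header = ["word", "corpus", "k0", "k2", "k1", "k10", "cat"]
real_column, run_columns = split_runs(header, "k")
```

Sorting by the numeric suffix rather than by name keeps the runs in order (k2 before k10).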

=======================================End Legend======================================================

