Technical Report Series on Corpus Building
Technical Report Series on Corpus Building, Vol. 2 (March 2013)
Danish Corpora
Uwe Quasthoff, Dirk Goldhahn, Erla Hallsteinsdóttir
Abteilung Automatische Sprachverarbeitung, Institut für Informatik, Universität Leipzig
Affiliation of the authors: Uwe Quasthoff and Dirk Goldhahn: Institut für Informatik, Universität Leipzig {quasthoff, Erla Hallsteinsdóttir: Institut for Sprog og Kommunikation, Syddansk Universitet Odense
Copyright: Abteilung Automatische Sprachverarbeitung, Institut für Informatik, Universität Leipzig
Technical Report Series on Corpus Building
Vol. 1: Deutscher Wortschatz 2013
Vol. 2: Danish Corpora
This PDF document was created using the open source tool mwlib. For more information, see PDF generated at: Tue, 15 May :19:38 UTC
3 Danish corpora 1 Introduction to corpus creation 1 DAN - a processing related language description 2 DAN corpora 4 DAN corpus comparison 8 Processing details 10 Appendix to dan news 2007: Database summary 10 Appendix to dan news 2008: Database summary 10 Appendix to dan news 2010: Database summary 11 Appendix to dan news 2011: Database summary 11 Appendix to dan newscrawl 2011: Database summary 12 Appendix to dan wikipedia 2007: Database summary 12 Appendix to dan wikipedia 2012: Database summary 13 Appendix to dan web 2002: Database summary 13 Appendix to dan web 2011: Database summary 14 Appendix to dan mixed 2012: Database summary 14 Content details 15 Appendix to dan news 2007: Size of different TLDs 15 Appendix to dan news 2008: Size of different TLDs 15 Appendix to dan news 2010: Size of different TLDs 16 Appendix to dan news 2011: Size of different TLDs 16 Appendix to dan newscrawl 2011: Size of different TLDs 17 Appendix to dan web 2002: Size of different TLDs 17 Appendix to dan web 2011: Size of different TLDs 17 Appendix to dan mixed 2012: Size of different TLDs 18 Appendix to dan news 2007: Size of largest domains 18 Appendix to dan news 2008: Size of largest domains 19 Appendix to dan news 2010: Size of largest domains 19 Appendix to dan news 2011: Size of largest domains 20 Appendix to dan newscrawl 2011: Size of largest domains 21 Appendix to dan web 2002: Size of largest domains 21
4 Appendix to dan web 2011: Size of largest domains 22 Appendix to dan mixed 2012: Size of largest domains 22 Appendix to dan news 2007: Number of sources by time period 23 Appendix to dan news 2008: Number of sources by time period 24 Appendix to dan news 2010: Number of sources by time period 25 Appendix to dan news 2011: Number of sources by time period 26 Word details 28 Appendix to dan news 2007: Words by length without multiplicity 28 Appendix to dan news 2008: Words by length without multiplicity 30 Appendix to dan news 2010: Words by length without multiplicity 32 Appendix to dan news 2011: Words by length without multiplicity 34 Appendix to dan newscrawl 2011: Words by length without multiplicity 36 Appendix to dan wikipedia 2007: Words by length without multiplicity 38 Appendix to dan wikipedia 2012: Words by length without multiplicity 40 Appendix to dan web 2002: Words by length without multiplicity 42 Appendix to dan web 2011: Words by length without multiplicity 44 Appendix to dan mixed 2012: Words by length without multiplicity 46 Appendix to dan news 2007: Words by length with multiplicity 48 Appendix to dan news 2008: Words by length with multiplicity 50 Appendix to dan news 2010: Words by length with multiplicity 52 Appendix to dan news 2011: Words by length with multiplicity 54 Appendix to dan newscrawl 2011: Words by length with multiplicity 56 Appendix to dan wikipedia 2007: Words by length with multiplicity 58 Appendix to dan wikipedia 2012: Words by length with multiplicity 60 Appendix to dan web 2002: Words by length with multiplicity 62 Appendix to dan web 2011: Words by length with multiplicity 64 Appendix to dan mixed 2012: Words by length with multiplicity 66 Appendix to dan news 2007: The most frequent 50 words 67 Appendix to dan news 2008: The most frequent 50 words 68 Appendix to dan news 2010: The most frequent 50 words 69 Appendix to dan news 2011: The most frequent 50 words 70 Appendix to dan newscrawl 2011: The most frequent 50 
words 71 Appendix to dan wikipedia 2007: The most frequent 50 words 72 Appendix to dan wikipedia 2012: The most frequent 50 words 73 Appendix to dan web 2002: The most frequent 50 words 74 Appendix to dan web 2011: The most frequent 50 words 75 Appendix to dan mixed 2012: The most frequent 50 words 76
5 Appendix to dan news 2007: Longest words in top by rank 77 Appendix to dan news 2008: Longest words in top by rank 78 Appendix to dan news 2010: Longest words in top by rank 79 Appendix to dan news 2011: Longest words in top by rank 80 Appendix to dan newscrawl 2011: Longest words in top by rank 81 Appendix to dan wikipedia 2007: Longest words in top by rank 82 Appendix to dan wikipedia 2012: Longest words in top by rank 83 Appendix to dan web 2002: Longest words in top by rank 84 Appendix to dan web 2011: Longest words in top by rank 85 Appendix to dan mixed 2012: Longest words in top by rank 86 Character details 87 Appendix to dan news 2007: Alphabet as used in the top words 87 Appendix to dan news 2008: Alphabet as used in the top words 88 Appendix to dan news 2010: Alphabet as used in the top words 90 Appendix to dan news 2011: Alphabet as used in the top words 91 Appendix to dan newscrawl 2011: Alphabet as used in the top words 92 Appendix to dan wikipedia 2007: Alphabet as used in the top words 94 Appendix to dan wikipedia 2012: Alphabet as used in the top words 95 Appendix to dan web 2002: Alphabet as used in the top words 96 Appendix to dan web 2011: Alphabet as used in the top words 98 Appendix to dan mixed 2012: Alphabet as used in the top words 99 Abbreviation details 101 Appendix to dan news 2007: Most frequent abbreviations 101 Appendix to dan news 2008: Most frequent abbreviations 102 Appendix to dan news 2010: Most frequent abbreviations 103 Appendix to dan news 2011: Most frequent abbreviations 104 Appendix to dan newscrawl 2011: Most frequent abbreviations 105 Appendix to dan wikipedia 2007: Most frequent abbreviations 106 Appendix to dan wikipedia 2012: Most frequent abbreviations 107 Appendix to dan web 2002: Most frequent abbreviations 108 Appendix to dan web 2011: Most frequent abbreviations 109 Appendix to dan mixed 2012: Most frequent abbreviations 110 Appendix to dan news 2007: Left neighbors of the full stop 111 Appendix to dan news 2008: 
Left neighbors of the full stop 112 Appendix to dan news 2010: Left neighbors of the full stop 113 Appendix to dan news 2011: Left neighbors of the full stop 114
6 Appendix to dan newscrawl 2011: Left neighbors of the full stop 115 Appendix to dan wikipedia 2007: Left neighbors of the full stop 116 Appendix to dan wikipedia 2012: Left neighbors of the full stop 117 Appendix to dan web 2002: Left neighbors of the full stop 118 Appendix to dan web 2011: Left neighbors of the full stop 119 Appendix to dan mixed 2012: Left neighbors of the full stop 120 Appendix to dan news 2007: Left neighbors of the full stop with additional internal full stops 121 Appendix to dan news 2008: Left neighbors of the full stop with additional internal full stops 122 Appendix to dan news 2010: Left neighbors of the full stop with additional internal full stops 123 Appendix to dan news 2011: Left neighbors of the full stop with additional internal full stops 124 Appendix to dan newscrawl 2011: Left neighbors of the full stop with additional internal full stops 125 Appendix to dan wikipedia 2007: Left neighbors of the full stop with additional internal full stops 126 Appendix to dan wikipedia 2012: Left neighbors of the full stop with additional internal full stops 127 Appendix to dan web 2002: Left neighbors of the full stop with additional internal full stops 128 Appendix to dan web 2011: Left neighbors of the full stop with additional internal full stops 129 Appendix to dan mixed 2012: Left neighbors of the full stop with additional internal full stops 130 Sentences details 131 Appendix to dan news 2007: Shortest sentences 131 Appendix to dan news 2008: Shortest sentences 132 Appendix to dan news 2010: Shortest sentences 134 Appendix to dan news 2011: Shortest sentences 135 Appendix to dan newscrawl 2011: Shortest sentences 137 Appendix to dan wikipedia 2007: Shortest sentences 138 Appendix to dan wikipedia 2012: Shortest sentences 140 Appendix to dan web 2002: Shortest sentences 141 Appendix to dan web 2011: Shortest sentences 143 Appendix to dan mixed 2012: Shortest sentences 144 Appendix to dan news 2007: Longest sentences 146 Appendix to dan 
news 2008: Longest sentences 148 Appendix to dan news 2010: Longest sentences 150 Appendix to dan news 2011: Longest sentences 152 Appendix to dan newscrawl 2011: Longest sentences 154 Appendix to dan wikipedia 2007: Longest sentences 156 Appendix to dan wikipedia 2012: Longest sentences 158 Appendix to dan web 2002: Longest sentences 160 Appendix to dan web 2011: Longest sentences 162 Appendix to dan mixed 2012: Longest sentences 164
7 Appendix to dan news 2007: Length of sentences in characters 166 Appendix to dan news 2008: Length of sentences in characters 167 Appendix to dan news 2010: Length of sentences in characters 168 Appendix to dan news 2011: Length of sentences in characters 169 Appendix to dan newscrawl 2011: Length of sentences in characters 170 Appendix to dan wikipedia 2007: Length of sentences in characters 171 Appendix to dan wikipedia 2012: Length of sentences in characters 172 Appendix to dan web 2002: Length of sentences in characters 173 Appendix to dan web 2011: Length of sentences in characters 174 Appendix to dan mixed 2012: Length of sentences in characters 175 Appendix to dan news 2007: Length of sentences in words 176 Appendix to dan news 2008: Length of sentences in words 177 Appendix to dan news 2010: Length of sentences in words 178 Appendix to dan news 2011: Length of sentences in words 179 Appendix to dan newscrawl 2011: Length of sentences in words 180 Appendix to dan wikipedia 2007: Length of sentences in words 181 Appendix to dan wikipedia 2012: Length of sentences in words 182 Appendix to dan web 2002: Length of sentences in words 183 Appendix to dan web 2011: Length of sentences in words 184 Appendix to dan mixed 2012: Length of sentences in words 185 Oddities details 186 Appendix to dan news 2007: Longest words 186 Appendix to dan news 2008: Longest words 186 Appendix to dan news 2010: Longest words 187 Appendix to dan news 2011: Longest words 187 Appendix to dan newscrawl 2011: Longest words 188 Appendix to dan wikipedia 2007: Longest words 188 Appendix to dan wikipedia 2012: Longest words 189 Appendix to dan web 2002: Longest words 189 Appendix to dan web 2011: Longest words 190 Appendix to dan mixed 2012: Longest words 190 Appendix to dan news 2007: Sentences with high average word length 191 Appendix to dan news 2008: Sentences with high average word length 192 Appendix to dan news 2010: Sentences with high average word length 193 Appendix to dan news 
2011: Sentences with high average word length 194 Appendix to dan newscrawl 2011: Sentences with high average word length 195 Appendix to dan wikipedia 2012: Sentences with high average word length 196
8 Appendix to dan news 2007: Problems with sentence segmentation - words ending in a stopword 197 Appendix to dan news 2008: Problems with sentence segmentation - words ending in a stopword 197 Appendix to dan news 2010: Problems with sentence segmentation - words ending in a stopword 198 Appendix to dan news 2011: Problems with sentence segmentation - words ending in a stopword 199 Appendix to dan newscrawl 2011: Problems with sentence segmentation - words ending in a stopword 200 Appendix to dan wikipedia 2007: Problems with sentence segmentation - words ending in a stopword 201 Appendix to dan wikipedia 2012: Problems with sentence segmentation - words ending in a stopword 201 Appendix to dan web 2002: Problems with sentence segmentation - words ending in a stopword 202 Appendix to dan web 2011: Problems with sentence segmentation - words ending in a stopword 203 Appendix to dan mixed 2012: Problems with sentence segmentation - words ending in a stopword 204
Danish corpora

Introduction to corpus creation

The Leipzig Corpora Collection (LCC) collects Web-based corpora for many different languages. The main text genres are newspaper texts, Wikipedias and randomly collected web pages. All corpora are processed in the same way:
- Crawling Web pages
- HTML stripping
- Language identification
- Sentence segmentation
- Cleaning: removal of ill-formed sentences
- Duplicate removal
- Calculation of word frequencies and word co-occurrences

As a result we have a corpus containing only well-formed sentences in the language under consideration. The sentences are in random order; hence, sharing the corpus does not violate copyright law because it is impossible to reconstruct the original texts.

The pre-processing contains both language-independent steps (like HTML stripping and duplicate removal) and language-dependent steps (like language identification and sentence segmentation). The language-specific parts in particular are vulnerable to specific processing problems. The aim of the paper is to identify possible problems and evaluate the results. The following problems are addressed:
- A processing-focused language description
- Language size: How much text is available for this language? What are the biggest sources?
- Corpus description: Genre, size, crawling and processing date.
- Possible problems in language identification: Which languages are similar?
- Character set and alphabet
- Inspecting the word list: Most frequent words, longer high-frequency words and the longest words overall. Word length distribution.
- Can abbreviations confuse sentence segmentation? Information about the abbreviation list.
- Inspecting sentences: Inspect the shortest and longest sentences to identify possible segmentation problems. Sentence length distribution.

The paper describes the results of these inspections; the appendices show the exact results for the different corpora. This helps to compare the corpora with respect to quality. In the section Quality Overview, an overall quality description for each corpus is given. All corpora contain only minor problems which are irrelevant for most applications. Where this was not the case, the corpus creation was repeated.
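The later stages of the processing chain listed above (cleaning, duplicate removal, random ordering, word frequencies) can be illustrated with a minimal Python sketch. It is not the LCC toolchain: the function name `build_corpus`, the well-formedness rule (uppercase letter or digit at the start, sentence-final punctuation at the end) and the whitespace tokenization are assumptions chosen for illustration, and crawling, HTML stripping, language identification and co-occurrence counting are left out.

```python
import random
import re
from collections import Counter


def build_corpus(raw_sentences, seed=0):
    """Clean, de-duplicate and shuffle sentences, then count word frequencies."""
    # Cleaning (assumed rule): keep sentences that start with an uppercase
    # letter or digit and end with sentence-final punctuation.
    well_formed = [
        s.strip()
        for s in raw_sentences
        if re.match(r"^[A-ZÆØÅ0-9].*[.!?]$", s.strip())
    ]
    # Duplicate removal: exact duplicates only, the first occurrence is kept.
    unique = list(dict.fromkeys(well_formed))
    # Random order, so that the original texts cannot be reconstructed.
    random.Random(seed).shuffle(unique)
    # Word frequencies over whitespace tokens with surrounding punctuation removed.
    freq = Counter(
        tok.strip(".,;:!?\"'()") for s in unique for tok in s.split()
    )
    freq.pop("", None)
    return unique, freq
```

Near-duplicate sentences, which the report names as a remaining quality problem, are not caught by the exact-match step in this sketch.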
DAN - a processing related language description

General properties of the Danish language
- Native name: Dansk
- Classification: Indo-European, Germanic, North, East Scandinavian, Danish-Swedish, Danish-Riksmal, Danish
- Total number of speakers: 5.6M
- Largest countries with number of speakers: Denmark (5.6M)
Source: www.dst.dk/en/Statistik/emner/befolkning-og-befolkningsfremskrivning/folketal.aspx

Processing summary
- Latin alphabet with some additional characters
- The full stop is used as sentence boundary and for abbreviations
- Apostrophes are used rarely

Properties important for processing

Alphabet and punctuation
The alphabet is Latin based, with the following special features (sources: en.wikipedia.org/wiki/Alphabets_derived_from_the_Latin and en.wikipedia.org/wiki/Danish_and_Norwegian_alphabet):
- Danish includes all 26 base letters and Æ, Ø, Å
- Additional letter forms: É (a diacritic used for disambiguation: en/et - én/ét)
- In foreign words: Á, À, Â, Ä, É, È, Ê, Ë, Í, Ì, Î, Ï, Ó, Ò, Ô, Ö, Ú, Ù, Û, Ü and more
- Additional digraphs: EE in foreign words (trainee, frisbee); AA in older texts (replaced by å in 1948) and names (Aalborg, Aarhus). NB! Aa is treated like Å in alphabetical sorting in Danish words only, meaning that Aabenraa is listed under Å (last letter of the alphabet) and Aachen under A.
- Å, Æ and Ø might occur as AA, AE and OE in newer texts (avoidance of language-specific letters)
- Usual Latin punctuation
- Usage of uppercase letters: At sentence beginnings and for proper names (of persons, organisations, countries etc.). When a word beginning with Aa is capitalized, only the first letter becomes a capital, e.g. Aarhus.

Sentence segmentation and word tokenization

Sentence beginnings
Sentences begin with a capitalized first word.

Abbreviations
Abbreviations can be confused with sentence boundaries: a special abbreviation list has to be inspected. Sources for abbreviations: www.dsn.dk/retskrivning/retskrivningsregler/a / a / a7-42 and www.dsn.dk/sprogviden/udgivelser/sprognaevnets-skriftserie-1/flere-udgivelser/Rigtigt%20kort%20indskannet.pdf/at_download/file
Abbreviations with a full stop may appear in the word list without the full stop.

Apostrophes (www.dsn.dk/retskrivning/retskrivningsregler/a7-1-6/a7-6)
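The sorting rule for Aa described above (Aabenraa under Å, Aachen under A) can be sketched as a sort key. This is an illustration only: `danish_sort_key` and its `foreign` parameter are hypothetical names, and the set of foreign words has to be supplied by the caller because the sketch cannot decide by itself whether a word is Danish.

```python
DANISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(DANISH_ALPHABET)}


def danish_sort_key(word, foreign=frozenset()):
    """Sort key in Danish alphabet order (æ, ø, å last), treating aa as å."""
    w = word.lower()
    # The aa -> å rule applies to Danish words only; words listed in
    # `foreign` (an assumed, caller-supplied set) keep their plain "aa".
    if word not in foreign:
        w = w.replace("aa", "å")
    # Characters outside the alphabet sort after å, ordered by code point.
    return [(RANK.get(ch, len(DANISH_ALPHABET)), ch) for ch in w]


words = ["Aarhus", "Aachen", "Odense", "Ærø", "Aabenraa", "Zebra"]
ordered = sorted(words, key=lambda w: danish_sort_key(w, foreign={"Aachen"}))
# ordered == ["Aachen", "Odense", "Zebra", "Ærø", "Aabenraa", "Aarhus"]
```

The blanket replacement of "aa" is a simplification: it would also rewrite an "aa" that arises at a compound boundary, so a production collation would need a lexicon or a locale-aware collation library instead.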
Use of apostrophes: infrequent.
- in elliptical forms like "bli'", "hva'", "ha'", "ka'" and "la'r" instead of "blive", "hvad", "have", "kan" and "lader" (Please check why "ha'" is always followed by a ";"; this does not fit)
- to mark the combination of a word/radical and inflectional endings, e.g. in combination with the definite article: euro'en, PC'en, SMS'erne, OP'ens, CD-ROM'en
- to mark the genitive (instead of "s") in words that end with the letters s, z or x: Marx's ven Wilhelm Liebknecht, Georg Brandes' Plads
- to mark a genitive or plural form with "s": Jan's, foto's (both incorrect but frequent usages), and, in certain cases, other inflectional endings on proper names: Albert'er (2x Albert), Alberte'r (2x Alberte), Borges'ske dimensioner, Crohn's sygdom
- in combination with English (or other foreign) words: chicken satay's, Google's brugsoplevelser
- to mark the combination of numerals and inflectional endings: 60'er-rock
- to mark the combination of foreign words ending in "-ee" and inflectional endings: frisbee'en, yankee'er
- Mainly used to mark citations

Sources and ranking (2012)
- Estimated number of webpages containing text
- Google.com top-5 words: results for "i" "og" "at" "er" "på"
- Google.com top-10 words: results for "i" "og" "at" "er" "på" "til" "en" "af" "for" "med"
- Rank according to number of speakers (Ethnologue): 111
- Rank according to Wikipedia size (see de.wikipedia.org/wiki/Wikipedia:Sprachen): Rank 30 with articles.
- Rank according to number of newspapers as found by AbyZ (5/2012): 160 newspapers, rank 15.
- Rank according to number of newspapers with RSS feeds (5/2012): 110 newspapers, rank 14.
- Rank according to our corpus size (9/2012): 19
DAN corpora

Quality Overview

Quality Ratings
- A: Very good quality. Ready to use (or already used) for a frequency dictionary. Size as large as possible. Only minimal errors. Multiple genres (if possible).
- A-: Small problems identified. They should not affect usage.
- B: Native speaker quality. Information about abbreviations and sentence boundaries provided by a native speaker. Resulting statistics checked by a native speaker, possible errors corrected.
- C: Non-native speaker quality. Obvious problems shown in the corpus statistics are corrected.
- D: First version. Pre-processing with default abbreviation list and default sentence boundaries.
- E: Poor quality: old, outdated or faulty.

Corpus Quality
The quality of the corpora differs slightly because the corpus processing toolchain changed slightly over several years. Moreover, the original data are often no longer available. Hence, improving quality often means removing incomplete or doubtful sentences. Forthcoming editions of all corpora thus might have a slightly smaller number of sentences. This especially applies to near-duplicate sentences, which are removed only sparingly.
The following table shows the quality of the corpora. Minimal errors are still possible and are described in the sections below. All possible major improvements are mentioned here.

Corpus | Quality rating | Known problems | To-dos
dan_news_2007 | A- | near duplicates, see sentence length distribution | -
dan_news_2008 | A | - | -
dan_news_2010 | A | - | -
dan_news_2011 | A | - | -
dan_newscrawl_2011 | A | - | -
dan_wikipedia_2007 | A- | near duplicates, see sentence length distribution | -
dan_wikipedia_2012 | A | - | -
dan_web_2002 | A | - | -
dan_web_2011 | A | - | -
dan_mixed_2012 | A | - | -
Processing Overview
For more details, see Appendix: Database Summary and Appendix: Number of sources by time period.

Corpus | Size (M sentences) | Size (M running words) | Multiwords | Crawling date | Production date
dan_news_ / dan_news_ / dan_news_ / dan_news_ daily dan_newscrawl_ batch crawling 2012 dan_wikipedia_ dump dan_wikipedia_ dump dan_web_ randomly dan_web_ randomly dan_mixed_

Content Overview
For more details, see Appendix: Size of different TLDs and Appendix: Size of different domains.

Corpus | Type of sources | Countries | Number of sources | Publishing date | Biggest source
dan_news_2007 | News | dk | 42 newspapers | |
dan_news_2008 | News | dk | 56 newspapers | |
dan_news_2010 | News | dk | 45 newspapers | |
dan_news_2011 | News | dk | 36 newspapers | |
dan_newscrawl_2011 | News | dk | 73 newspapers | 2011 and before |
dan_wikipedia_2007 | Wikipedia | | | |
dan_wikipedia_2012 | Wikipedia | | | |
dan_web_2002 | Web | dk | domains | 2002 and before |
dan_web_2011 | Web | dk | domains | 2011 and before | aarhus.lokalavisen.dk/
dan_mixed_2012 | combined | combined | domains | 2011 and before |

Words
Appendix: Words by Length without multiplicity and Appendix: Words by Length with multiplicity show the length distribution for words. The curves should be smooth and decreasing for length >= 5.
Appendix: The Most Frequent 50 Words shows the most frequent stopwords as well as one or more words related to the region.
Appendix: Longest Words in Top-1000 by rank shows the 25 longest words within the top 1000. They usually give an impression of the main topics treated in the corpus.
Appendix: Longest Words with minimum frequency 2 should give an idea of very long words. In the case of processing problems, different types of non-words may appear. This might help to improve the word definition.
Corpus | Word length graph without multiplicity | Word length graph with multiplicity | Most Frequent 50 Words | Longest Words in Top-1000 | Longest Words with minimum frequency 2
dan_news_2007 | okay | okay, min. avg | okay | okay | URLs, routes
dan_news_2008 | okay | okay | okay | okay | okay
dan_news_2010 | okay | okay | okay | okay | URLs, routes
dan_news_2011 | okay | okay | okay | okay | missing blanks, hex strings
dan_newscrawl_2011 | okay | okay | Publiceret and.. | okay | URLs, routes, missing blanks
dan_wikipedia_2007 | okay, min. avg | okay, max. avg | okay | okay | URLs, routes
dan_wikipedia_2012 | okay | okay | okay | okay | URLs
dan_web_2002 | okay, max. avg | okay | okay | okay | missing blanks, special characters
dan_web_2011 | okay | okay | okay | okay | missing blanks, special characters
dan_mixed_2012 | okay | okay | okay | okay | all errors as above

Remarks
The average word length (without multiplicity) differs for the different text genres. There is an unexpected minimum in the length distribution (with multiplicity) for length 4.

Abbreviations
For sentence boundary detection, abbreviations ending in a full stop are of interest: the full stop after such an abbreviation usually does not mark a sentence boundary. Conversely, abbreviations missing from the list can overgenerate sentence boundaries. The list of abbreviations is of high quality: nearly complete and manually checked.
Due to limitations in the processing chain, this list of abbreviations is only used for sentence boundary detection and is not included in the word list. Hence, abbreviations ending with a full stop appear in the word list without the full stop.

Sentences
Appendix: Shortest sentences shows the shortest declarative, exclamatory and interrogative sentences. In preprocessing, a minimal length for sentences might be specified. Missing abbreviations are often visible as faulty sentence endings.
Appendix: Longest sentences shows the longest declarative, exclamatory and interrogative sentences. Usually, the maximum sentence length is defined as 256 characters (not 256 bytes). Very long exclamatory or interrogative sentences often contain an overlooked sentence boundary.
Appendix: Length of sentences in characters shows the distribution of the sentence length. A large and balanced corpus will result in a smooth and bell-shaped curve. Isolated local maxima usually result from large sets of near-duplicate sentences.
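The interaction between the abbreviation list and sentence boundary detection described above can be shown with a small sketch. It is not the segmenter used for the corpora: `split_sentences` is a hypothetical helper, the boundary rule (sentence-final punctuation followed by whitespace and an uppercase letter) is an assumption, and the two abbreviations in the example ("kl.", "bl.a.") stand in for the full, manually checked list.

```python
import re


def split_sentences(text, abbreviations):
    """Split at . ! ? followed by whitespace and an uppercase letter,
    unless the token carrying the full stop is a known abbreviation."""
    sentences, start = [], 0
    for m in re.finditer(r"[.!?](?=\s+[A-ZÆØÅ])", text):
        end = m.end()
        # Last whitespace-separated token before the candidate boundary.
        token = text[start:end].split()[-1]
        if m.group() == "." and token.lower() in abbreviations:
            continue  # full stop belongs to an abbreviation, not a boundary
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


text = "Vi mødes kl. 9 i morgen. Husk bl.a. Peter og Anna. Kommer du?"
with_list = split_sentences(text, {"kl.", "bl.a."})
without_list = split_sentences(text, set())
```

With the list, the text is split into three sentences. Without it, "Husk bl.a." is cut off as a sentence of its own, which is the kind of faulty sentence ending that shows up among the shortest sentences.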
Corpus | Shortest sentences | Longest sentences | Length distribution (in characters) | Length distribution (in words)
dan_news_2007 | unbalanced quotation marks | okay | near duplicate peak at 48 | okay
dan_news_2008 | some unbalanced quotation marks | okay | sentences longer than 255? | okay
dan_news_2010 | okay | 1 menu list, 2x hex data | near duplicate peak at 42? | okay
dan_news_2011 | duplicate sentences | declarative sentences with many time data | near duplicate peak at 42 | okay
dan_newscrawl_2011 | okay | declarative sentences with many time data | many near duplicate peaks | many near duplicate peaks
dan_wikipedia_2007 | declarative sentences beginning with digits and ending with abbrev. | okay | near duplicate peak at 20 | sharp maximum at 10
dan_wikipedia_2012 | okay | okay | okay | okay
dan_web_2002 dan_web_2011 dan_mixed_2012 declarative non-sentences, interrogative sentences beginning lowercase or with blank Lowercase beginnings for declarative sentences okay very smooth okay Enumerations, multiple sentences max. 277 characters

Oddities
Appendix: Sentences with high average word length: Average sentences contain many stopwords, and these stopwords are usually short. Hence, they restrict the average word length in a sentence. Conversely, sentences with high average word length are often ill-formed. They may be used to improve pre-processing.
Appendix: Problems with sentence segmentation - Words ending in a stopword: If there are many ill-formed word or sentence boundaries without a blank between two words, they will generate new ill-formed words. The appendix shows the most frequent words ending in an uppercase stopword. If they are infrequent, then the data were of high quality.

Corpus | Sentences with high average word length | Words ending in a stopword
dan_news_2007 | all kinds of errors | okay
dan_news_2008 | okay | okay
dan_news_2010 | 2x hex strings | maxfreq=11
dan_news_2011 | 2x hex strings, 2x missing blanks | maxfreq=24
dan_newscrawl_2011 | 1x missing blanks | maxfreq=805
dan_wikipedia_2007 | (no data) | okay
dan_wikipedia_2012 | okay | maxfreq=27
dan_web_2002 | missing blanks, underscores | maxfreq=67
dan_web_2011 | missing blanks | maxfreq=58
dan_mixed_2012 | missing blanks, underscores | words containing ";"
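Both oddity checks are simple enough to sketch. The helper names and the tokenization below are assumptions for illustration, and "uppercase stopword" is read here as a capitalized stopword glued to a preceding lowercase letter (for example "regeringenDet"), which is one plausible interpretation of the missing-blank case.

```python
def average_word_length(sentence):
    """Mean token length, ignoring surrounding punctuation."""
    words = [w.strip(".,;:!?\"'()") for w in sentence.split()]
    words = [w for w in words if w]
    return sum(len(w) for w in words) / len(words) if words else 0.0


def rank_by_average_word_length(sentences, top=10):
    """Sentences with the highest average word length first."""
    return sorted(sentences, key=average_word_length, reverse=True)[:top]


def words_ending_in_stopword(words, stopwords):
    """Word forms that end in a capitalized stopword directly after a
    lowercase letter, which hints at a missing blank between two sentences."""
    hits = []
    for w in words:
        for sw in stopwords:
            cap = sw.capitalize()
            if w.endswith(cap) and len(w) > len(cap) and w[-len(cap) - 1].islower():
                hits.append(w)
                break
    return hits
```

For instance, `words_ending_in_stopword(["regeringenDet", "Det", "huset"], {"det"})` flags only "regeringenDet"; counting how often such forms occur per corpus would give a figure comparable to the maxfreq values in the table.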
DAN corpus comparison

Automated Corpus comparison
The comparisons are based on the following tests on the top-1000 words:
- Vectors based on the frequencies of the top-1000 words are created for the analysed languages. As the similarity value, 1-cos(alpha) of the angle alpha between these vectors is computed. Identical languages receive a value of 0, distinct languages get a value of 1.
- The same analysis is conducted using the frequencies of the top-1000 typical letter trigrams of the languages.

Monolingual word list comparison (top-1000 words)
As one would expect, the comparisons show:
- The different news corpora have word lists with maximum distance 0.19 (dan_newscrawl_2011 and dan_news_2008)
- The web corpora have word lists with distance 0.13
- The wikipedia corpora are similar with distance 0.10
- The biggest distance of 0.36 can be found between dan_wikipedia_2007 and dan_news_2008
- The mixed corpus dan_mixed_2012 has a central position within the corpora and has a maximum distance of 0.31 to the wikipedia_2007 corpus

Multilingual word list comparison (top-1000 words)
Both the comparison of the top-1000 words and the comparison of the letter trigrams used in these words were conducted to find the most similar languages based on these features. Considering words, the nearest language to Danish is Swedish with distance 0.47. Considering letter trigrams, the nearest language is Bokmål with distance 0.38. These distances are below average. On average the value for the most similar language to a language in question is 0.58 for trigrams.
The most similar languages based on words: Swedish, Bokmål, Nynorsk

source | language_short_name | language_name | cos_logfreq
dan | swe | Swedish |
dan | nob | Norwegian, Bokmål |
dan | nno | Norwegian, Nynorsk |
dan | fao | Faroese |
dan | isl | Icelandic |

The most similar languages based on letter trigrams: Bokmål, Swedish, Dutch

source | language_short_name | language_name | cos_logfreq
dan | nob | Norwegian, Bokmål |
dan | swe | Swedish |
dan | nld | Dutch |
dan | nno | Norwegian, Nynorsk |
dan | deu | German |
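The distance used in this section, 1-cos(alpha) between frequency vectors of the top-1000 items, can be written down in a few lines. The sketch below is an illustration with hypothetical function names; it uses raw frequencies, whereas the column name cos_logfreq suggests that the report's figures may be based on log frequencies, so the numbers it produces are not expected to match the tables.

```python
import math
from collections import Counter


def top_n_vector(freq, n=1000):
    """Restrict a frequency mapping to its n most frequent items."""
    return dict(Counter(freq).most_common(n))


def cosine_distance(freq_a, freq_b, n=1000):
    """1 - cos(alpha) between the top-n frequency vectors of two lists.
    0 means identical direction, 1 means no shared items."""
    a, b = top_n_vector(freq_a, n), top_n_vector(freq_b, n)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)
```

The same function applies to letter trigrams when the mappings hold trigram counts instead of word counts.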
Processing details

Appendix to dan news 2007: Database summary
Values for some general parameters
Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords | 0
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan news 2008: Database summary
Values for some general parameters
Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan news 2010: Database summary
Values for some general parameters
Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords | 9696
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan news 2011: Database summary
Values for some general parameters
Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan newscrawl 2011: Database summary
Values for some general parameters
Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan wikipedia 2007: Database summary
Values for some general parameters
Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |
052430_EngelskC 08/09/05 13:29 Side 1 De Merkantile Erhvervsuddannelser September 2005 Side 1 af 4 sider Casebaseret eksamen Engelsk Niveau C www.jysk.dk og www.jysk.com Indhold: Opgave 1 Presentation
Læs mereVina Nguyen HSSP July 13, 2008
Vina Nguyen HSSP July 13, 2008 1 What does it mean if sets A, B, C are a partition of set D? 2 How do you calculate P(A B) using the formula for conditional probability? 3 What is the difference between
Læs mereAktivering af Survey funktionalitet
Surveys i REDCap REDCap gør det muligt at eksponere ét eller flere instrumenter som et survey (spørgeskema) som derefter kan udfyldes direkte af patienten eller forsøgspersonen over internettet. Dette
Læs mereOn the complexity of drawing trees nicely: corrigendum
Acta Informatica 40, 603 607 (2004) Digital Object Identifier (DOI) 10.1007/s00236-004-0138-y On the complexity of drawing trees nicely: corrigendum Thorsten Akkerman, Christoph Buchheim, Michael Jünger,
Læs merePARALLELIZATION OF ATTILA SIMULATOR WITH OPENMP MIGUEL ÁNGEL MARTÍNEZ DEL AMOR MINIPROJECT OF TDT24 NTNU
PARALLELIZATION OF ATTILA SIMULATOR WITH OPENMP MIGUEL ÁNGEL MARTÍNEZ DEL AMOR MINIPROJECT OF TDT24 NTNU OUTLINE INEFFICIENCY OF ATTILA WAYS TO PARALLELIZE LOW COMPATIBILITY IN THE COMPILATION A SOLUTION
Læs mereWebsite review groweasy.dk
Website review groweasy.dk Generated on September 01 2016 10:32 AM The score is 56/100 SEO Content Title Webbureau Odense GrowEasy hjælper dig med digital markedsføring! Length : 66 Perfect, your title
Læs mereTo the reader: Information regarding this document
To the reader: Information regarding this document All text to be shown to respondents in this study is going to be in Danish. The Danish version of the text (the one, respondents are going to see) appears
Læs mereStatistical information form the Danish EPC database - use for the building stock model in Denmark
Statistical information form the Danish EPC database - use for the building stock model in Denmark Kim B. Wittchen Danish Building Research Institute, SBi AALBORG UNIVERSITY Certification of buildings
Læs mereNyhedsmail, december 2013 (scroll down for English version)
Nyhedsmail, december 2013 (scroll down for English version) Kære Omdeler Julen venter rundt om hjørnet. Og netop julen er årsagen til, at NORDJYSKE Distributions mange omdelere har ekstra travlt med at
Læs mereLearnings from the implementation of Epic
Learnings from the implementation of Epic Appendix Picture from Region H (2016) A thesis report by: Oliver Metcalf-Rinaldo, oliv@itu.dk Stephan Mosko Jensen, smos@itu.dk Appendix - Table of content Appendix
Læs mereBilag. Resume. Side 1 af 12
Bilag Resume I denne opgave, lægges der fokus på unge og ensomhed gennem sociale medier. Vi har i denne opgave valgt at benytte Facebook som det sociale medie vi ligger fokus på, da det er det største
Læs mereGUIDE TIL BREVSKRIVNING
GUIDE TIL BREVSKRIVNING APPELBREVE Formålet med at skrive et appelbrev er at få modtageren til at overholde menneskerettighederne. Det er en god idé at lægge vægt på modtagerens forpligtelser over for
Læs mereBasic statistics for experimental medical researchers
Basic statistics for experimental medical researchers Sample size calculations September 15th 2016 Christian Pipper Department of public health (IFSV) Faculty of Health and Medicinal Science (SUND) E-mail:
Læs mereHvor er mine runde hjørner?
Hvor er mine runde hjørner? Ofte møder vi fortvivlelse blandt kunder, når de ser deres nye flotte site i deres browser og indser, at det ser anderledes ud, i forhold til det design, de godkendte i starten
Læs mereTrolling Master Bornholm 2014
Trolling Master Bornholm 2014 (English version further down) Den ny havn i Tejn Havn Bornholms Regionskommune er gået i gang med at udvide Tejn Havn, og det er med til at gøre det muligt, at vi kan være
Læs mereTrolling Master Bornholm 2012
Trolling Master Bornholm 1 (English version further down) Tak for denne gang Det var en fornøjelse især jo også fordi vejret var med os. Så heldig har vi aldrig været før. Vi skal evaluere 1, og I må meget
Læs mereSport for the elderly
Sport for the elderly - Teenagers of the future Play the Game 2013 Aarhus, 29 October 2013 Ditte Toft Danish Institute for Sports Studies +45 3266 1037 ditte.toft@idan.dk A growing group in the population
Læs mereDoodleBUGS (Hands-on)
DoodleBUGS (Hands-on) Simple example: Program: bino_ave_sim_doodle.odc A simulation example Generate a sample from F=(r1+r2)/2 where r1~bin(0.5,200) and r2~bin(0.25,100) Note that E(F)=(100+25)/2=62.5
Læs mereSkriftlig Eksamen Beregnelighed (DM517)
Skriftlig Eksamen Beregnelighed (DM517) Institut for Matematik & Datalogi Syddansk Universitet Mandag den 7 Januar 2008, kl. 9 13 Alle sædvanlige hjælpemidler (lærebøger, notater etc.) samt brug af lommeregner
Læs mereSAS Corporate Program Website
SAS Corporate Program Website Dear user We have developed SAS Corporate Program Website to make the administration of your company's travel activities easier. You can read about it in this booklet, which
Læs mereProject Step 7. Behavioral modeling of a dual ported register set. 1/8/ L11 Project Step 5 Copyright Joanne DeGroat, ECE, OSU 1
Project Step 7 Behavioral modeling of a dual ported register set. Copyright 2006 - Joanne DeGroat, ECE, OSU 1 The register set Register set specifications 16 dual ported registers each with 16- bit words
Læs mereSkriftlig Eksamen Kombinatorik, Sandsynlighed og Randomiserede Algoritmer (DM528)
Skriftlig Eksamen Kombinatorik, Sandsynlighed og Randomiserede Algoritmer (DM58) Institut for Matematik og Datalogi Syddansk Universitet, Odense Torsdag den 1. januar 01 kl. 9 13 Alle sædvanlige hjælpemidler
Læs mereSpecial VFR. - ved flyvning til mindre flyveplads uden tårnkontrol som ligger indenfor en kontrolzone
Special VFR - ved flyvning til mindre flyveplads uden tårnkontrol som ligger indenfor en kontrolzone SERA.5005 Visual flight rules (a) Except when operating as a special VFR flight, VFR flights shall be
Læs mereTrolling Master Bornholm 2016 Nyhedsbrev nr. 8
Trolling Master Bornholm 2016 Nyhedsbrev nr. 8 English version further down Der bliver landet fisk men ikke mange Her er det Johnny Nielsen, Søløven, fra Tejn, som i denne uge fangede 13,0 kg nord for
Læs mereBesvarelser til Lineær Algebra Reeksamen Februar 2017
Besvarelser til Lineær Algebra Reeksamen - 7. Februar 207 Mikkel Findinge Bemærk, at der kan være sneget sig fejl ind. Kontakt mig endelig, hvis du skulle falde over en sådan. Dette dokument har udelukkende
Læs mereECE 551: Digital System * Design & Synthesis Lecture Set 5
ECE 551: Digital System * Design & Synthesis Lecture Set 5 5.1: Verilog Behavioral Model for Finite State Machines (FSMs) 5.2: Verilog Simulation I/O and 2001 Standard (In Separate File) 3/4/2003 1 ECE
Læs mereTrolling Master Bornholm 2016 Nyhedsbrev nr. 3
Trolling Master Bornholm 2016 Nyhedsbrev nr. 3 English version further down Den første dag i Bornholmerlaks konkurrencen Formanden for Bornholms Trollingklub, Anders Schou Jensen (og meddomer i TMB) fik
Læs mereTrolling Master Bornholm 2016 Nyhedsbrev nr. 7
Trolling Master Bornholm 2016 Nyhedsbrev nr. 7 English version further down Så var det omsider fiskevejr En af dem, der kom på vandet i en af hullerne, mellem den hårde vestenvind var Lejf K. Pedersen,
Læs mereTrolling Master Bornholm 2014
Trolling Master Bornholm 2014 (English version further down) Ny præmie Trolling Master Bornholm fylder 10 år næste gang. Det betyder, at vi har fundet på en ny og ganske anderledes præmie. Den fisker,
Læs mereBookingmuligheder for professionelle brugere i Dansehallerne 2015-16
Bookingmuligheder for professionelle brugere i Dansehallerne 2015-16 Modtager man økonomisk støtte til et danseprojekt, har en premieredato og er professionel bruger af Dansehallerne har man mulighed for
Læs mereTrolling Master Bornholm 2013
Trolling Master Bornholm 2013 (English version further down) Tilmeldingen åbner om to uger Mandag den 3. december kl. 8.00 åbner tilmeldingen til Trolling Master Bornholm 2013. Vi har flere tilmeldinger
Læs mereStrings and Sets: set complement, union, intersection, etc. set concatenation AB, power of set A n, A, A +
Strings and Sets: A string over Σ is any nite-length sequence of elements of Σ The set of all strings over alphabet Σ is denoted as Σ Operators over set: set complement, union, intersection, etc. set concatenation
Læs mereESG reporting meeting investors needs
ESG reporting meeting investors needs Carina Ohm Nordic Head of Climate Change and Sustainability Services, EY DIRF dagen, 24 September 2019 Investors have growing focus on ESG EY Investor Survey 2018
Læs mereDanish Language Course for International University Students Copenhagen, 12 July 1 August Application form
Danish Language Course for International University Students Copenhagen, 12 July 1 August 2017 Application form Must be completed on the computer in Danish or English All fields are mandatory PERSONLIGE
Læs mereApplications. Computational Linguistics: Jordan Boyd-Graber University of Maryland RL FOR MACHINE TRANSLATION. Slides adapted from Phillip Koehn
Applications Slides adapted from Phillip Koehn Computational Linguistics: Jordan Boyd-Graber University of Maryland RL FOR MACHINE TRANSLATION Computational Linguistics: Jordan Boyd-Graber UMD Applications
Læs mereDendrokronologisk Laboratorium
Dendrokronologisk Laboratorium NNU rapport 14, 2001 ROAGER KIRKE, TØNDER AMT Nationalmuseet og Den Antikvariske Samling i Ribe. Undersøgt af Orla Hylleberg Eriksen. NNU j.nr. A5712 Foto: P. Kristiansen,
Læs mereHeuristics for Improving
Heuristics for Improving Model Learning Based Testing Muhammad Naeem Irfan VASCO-LIG LIG, Computer Science Lab, Grenoble Universities, 38402 Saint Martin d Hères France Introduction Component Based Software
Læs mereDanish Language Course for Foreign University Students Copenhagen, 13 July 2 August 2016 Advanced, medium and beginner s level.
Danish Language Course for Foreign University Students Copenhagen, 13 July 2 August 2016 Advanced, medium and beginner s level Application form Must be completed on the computer in Danish or English All
Læs mereapplies equally to HRT and tibolone this should be made clear by replacing HRT with HRT or tibolone in the tibolone SmPC.
Annex I English wording to be implemented SmPC The texts of the 3 rd revision of the Core SPC for HRT products, as published on the CMD(h) website, should be included in the SmPC. Where a statement in
Læs mereIBM Network Station Manager. esuite 1.5 / NSM Integration. IBM Network Computer Division. tdc - 02/08/99 lotusnsm.prz Page 1
IBM Network Station Manager esuite 1.5 / NSM Integration IBM Network Computer Division tdc - 02/08/99 lotusnsm.prz Page 1 New esuite Settings in NSM The Lotus esuite Workplace administration option is
Læs mereVelkommen til IFF QA erfa møde d. 15. marts Erfaringer med miljømonitorering og tolkning af nyt anneks 1.
Velkommen til IFF QA erfa møde d. 15. marts 2018 Erfaringer med miljømonitorering og tolkning af nyt anneks 1. 1 Fast agenda kl.16.30-18.00 1. Nyt fra kurser, seminarer, myndighedsinspektioner, audit som
Læs mereCentral Statistical Agency.
Central Statistical Agency www.csa.gov.et 1 Outline Introduction Characteristics of Construction Aim of the Survey Methodology Result Conclusion 2 Introduction Meaning of Construction Construction may
Læs mereSubject to terms and conditions. WEEK Type Price EUR WEEK Type Price EUR WEEK Type Price EUR WEEK Type Price EUR
ITSO SERVICE OFFICE Weeks for Sale 31/05/2015 m: +34 636 277 307 w: clublasanta-timeshare.com e: roger@clublasanta.com See colour key sheet news: rogercls.blogspot.com Subject to terms and conditions THURSDAY
Læs mereDK - Quick Text Translation. HEYYER Net Promoter System Magento extension
DK - Quick Text Translation HEYYER Net Promoter System Magento extension Version 1.0 15-11-2013 HEYYER / Email Templates Invitation Email Template Invitation Email English Dansk Title Invitation Email
Læs mereATEX direktivet. Vedligeholdelse af ATEX certifikater mv. Steen Christensen stec@teknologisk.dk www.atexdirektivet.
ATEX direktivet Vedligeholdelse af ATEX certifikater mv. Steen Christensen stec@teknologisk.dk www.atexdirektivet.dk tlf: 7220 2693 Vedligeholdelse af Certifikater / tekniske dossier / overensstemmelseserklæringen.
Læs mereLinear Programming ١ C H A P T E R 2
Linear Programming ١ C H A P T E R 2 Problem Formulation Problem formulation or modeling is the process of translating a verbal statement of a problem into a mathematical statement. The Guidelines of formulation
Læs mereTrolling Master Bornholm 2015
Trolling Master Bornholm 2015 (English version further down) Sæsonen er ved at komme i omdrejninger. Her er det John Eriksen fra Nexø med 95 cm og en kontrolleret vægt på 11,8 kg fanget på østkysten af
Læs mereMandara. PebbleCreek. Tradition Series. 1,884 sq. ft robson.com. Exterior Design A. Exterior Design B.
Mandara 1,884 sq. ft. Tradition Series Exterior Design A Exterior Design B Exterior Design C Exterior Design D 623.935.6700 robson.com Tradition OPTIONS Series Exterior Design A w/opt. Golf Cart Garage
Læs mereStatistik for MPH: 7
Statistik for MPH: 7 3. november 2011 www.biostat.ku.dk/~pka/mph11 Attributable risk, bestemmelse af stikprøvestørrelse (Silva: 333-365, 381-383) Per Kragh Andersen 1 Fra den 6. uges statistikundervisning:
Læs mereSikkerhed & Revision 2013
Sikkerhed & Revision 2013 Samarbejde mellem intern revisor og ekstern revisor - og ISA 610 v/ Dorthe Tolborg Regional Chief Auditor, Codan Group og formand for IIA DK RSA REPRESENTATION WORLD WIDE 300
Læs mereImproving data services by creating a question database. Nanna Floor Clausen Danish Data Archives
Improving data services by creating a question database Nanna Floor Clausen Danish Data Archives Background Pressure on the students Decrease in response rates The users want more Why a question database?
Læs mereThe River Underground, Additional Work
39 (104) The River Underground, Additional Work The River Underground Crosswords Across 1 Another word for "hard to cope with", "unendurable", "insufferable" (10) 5 Another word for "think", "believe",
Læs mereDen nye Eurocode EC Geotenikerdagen Morten S. Rasmussen
Den nye Eurocode EC1997-1 Geotenikerdagen Morten S. Rasmussen UDFORDRINGER VED EC 1997-1 HVAD SKAL VI RUNDE - OPBYGNINGEN AF DE NYE EUROCODES - DE STØRSTE UDFORDRINGER - ER DER NOGET POSITIVT? 2 OPBYGNING
Læs mereBarnets navn: Børnehave: Kommune: Barnets modersmål (kan være mere end et)
Forældreskema Barnets navn: Børnehave: Kommune: Barnets modersmål (kan være mere end et) Barnets alder: år og måneder Barnet begyndte at lære dansk da det var år Søg at besvare disse spørgsmål så godt
Læs mereOverview LINKING METRICS BACKLINKS TYPES. URL Rating Domain Rating Backlinks Referring Domains. Referring Pages 173. text 173. Total Backlinks 184
Overview URL Rating Domain Rating Backlinks Referring Domains 12 35 184 11 0 0 0 0 LINKING METRICS Referring Pages 173 Total Backlinks 184 Crawled Pages 1 Referring IPs 9 Referring Subnets 8 Referring
Læs mereHandout 1: Eksamensspørgsmål
Handout 1: Eksamensspørgsmål Denne vejledning er udfærdiget på grundlag af Peter Bakkers vejledning til jeres eksamensspørgsmål. Hvis der skulle forekomme afvigelser fra Peter Bakkers vejledning, er det
Læs mereCS 4390/5387 SOFTWARE V&V LECTURE 5 BLACK-BOX TESTING - 2
1 CS 4390/5387 SOFTWARE V&V LECTURE 5 BLACK-BOX TESTING - 2 Outline 2 HW Solution Exercise (Equivalence Class Testing) Exercise (Decision Table Testing) Pairwise Testing Exercise (Pairwise Testing) 1 Homework
Læs mereSkriftlig Eksamen Diskret matematik med anvendelser (DM72)
Skriftlig Eksamen Diskret matematik med anvendelser (DM72) Institut for Matematik & Datalogi Syddansk Universitet, Odense Onsdag den 18. januar 2006 Alle sædvanlige hjælpemidler (lærebøger, notater etc.),
Læs mereForslag til implementering af ResearcherID og ORCID på SCIENCE
SCIENCE Forskningsdokumentation Forslag til implementering af ResearcherID og ORCID på SCIENCE SFU 12.03.14 Forslag til implementering af ResearcherID og ORCID på SCIENCE Hvad er WoS s ResearcherID? Hvad
Læs mereSports journalism in the sporting landscape
Sports journalism in the sporting landscape - Blind spots of the journalists Foto: Bjørn Giesenbauer/Flickr Play the Game 2013 Aarhus, 30 October 2013 Ditte Toft Danish Institute for Sports Studies/Play
Læs mereWIKI & Lady Avenue New B2B shop
WIKI & Lady Avenue New B2B shop Login Login: You need a personal username and password Du skal bruge et personligt username og password Only Recommended Retail Prices Viser kun vejl.priser! Bestilling
Læs mereTrolling Master Bornholm 2014
Trolling Master Bornholm 2014 (English version further down) Så er ballet åbnet, 16,64 kg: Det er Kim Christiansen, som i mange år også har deltaget i TMB, der tirsdag landede denne laks. Den måler 120
Læs mereUsing SL-RAT to Reduce SSOs
Using SL-RAT to Reduce SSOs Daniel R. Murphy, P.E. Lindsey L. Donbavand November 17, 2016 Presentation Outline Background Overview of Acoustic Inspection Approach Results Conclusion 2 Background Sanitary
Læs mereThe GAssist Pittsburgh Learning Classifier System. Dr. J. Bacardit, N. Krasnogor G53BIO - Bioinformatics
The GAssist Pittsburgh Learning Classifier System Dr. J. Bacardit, N. Krasnogor G53BIO - Outline bioinformatics Summary and future directions Objectives of GAssist GAssist [Bacardit, 04] is a Pittsburgh
Læs mereExercise 6.14 Linearly independent vectors are also affinely independent.
Affine sets Linear Inequality Systems Definition 6.12 The vectors v 1, v 2,..., v k are affinely independent if v 2 v 1,..., v k v 1 is linearly independent; affinely dependent, otherwise. We first check
Læs mereDendrokronologisk Laboratorium
Dendrokronologisk Laboratorium NNU rapport 8, 2001 BRO OVER SKJERN Å, RINGKØBING AMT Skjern Å Projektet/Oxbøl Statsskovdistrikt/RAS. Indsendt af Torben Egeberg og Mogens Schou Jørgensen. Undersøgt af Aoife
Læs mereStarWars-videointro. Start din video på den nørdede måde! Version: August 2012
StarWars-videointro Start din video på den nørdede måde! Version: August 2012 Indholdsfortegnelse StarWars-effekt til videointro!...4 Hent programmet...4 Indtast din tekst...5 Export til film...6 Avanceret
Læs mereDeveloping a tool for searching and learning. - the potential of an enriched end user thesaurus
Developing a tool for searching and learning - the potential of an enriched end user thesaurus The domain study Focus area The domain of EU EU as a practical oriented domain and not as a scientific domain.
Læs mereAppendix 1: Interview guide Maria og Kristian Lundgaard-Karlshøj, Ausumgaard
Appendix 1: Interview guide Maria og Kristian Lundgaard-Karlshøj, Ausumgaard Fortæl om Ausumgaard s historie Der er hele tiden snak om værdier, men hvad er det for nogle værdier? uddyb forklar definer
Læs mereThe complete construction for copying a segment, AB, is shown above. Describe each stage of the process.
A a compass, a straightedge, a ruler, patty paper B C A Stage 1 Stage 2 B C D Stage 3 The complete construction for copying a segment, AB, is shown above. Describe each stage of the process. Use a ruler
Læs mereTrolling Master Bornholm 2015
Trolling Master Bornholm 2015 (English version further down) Panorama billede fra starten den første dag i 2014 Michael Koldtoft fra Trolling Centrum har brugt lidt tid på at arbejde med billederne fra
Læs mereTrolling Master Bornholm 2013
Trolling Master Bornholm 2013 (English version further down) Tilmeldingerne til 2013 I dag nåede vi op på 85 tilmeldte både. Det er stadig lidt lavere end samme tidspunkt sidste år. Tilmeldingen er åben
Læs mereTM4 Central Station. User Manual / brugervejledning K2070-EU. Tel Fax
TM4 Central Station User Manual / brugervejledning K2070-EU STT Condigi A/S Niels Bohrs Vej 42, Stilling 8660 Skanderborg Denmark Tel. +45 87 93 50 00 Fax. +45 87 93 50 10 info@sttcondigi.com www.sttcondigi.com
Læs mereKvant Eksamen December 2010 3 timer med hjælpemidler. 1 Hvad er en continuous variable? Giv 2 illustrationer.
Kvant Eksamen December 2010 3 timer med hjælpemidler 1 Hvad er en continuous variable? Giv 2 illustrationer. What is a continuous variable? Give two illustrations. 2 Hvorfor kan man bedre drage konklusioner
Læs mereStrategic Capital ApS has requested Danionics A/S to make the following announcement prior to the annual general meeting on 23 April 2013:
Copenhagen, 23 April 2013 Announcement No. 9/2013 Danionics A/S Dr. Tværgade 9, 1. DK 1302 Copenhagen K, Denmark Tel: +45 88 91 98 70 Fax: +45 88 91 98 01 E-mail: investor@danionics.dk Website: www.danionics.dk
Læs mereHow Long Is an Hour? Family Note HOME LINK 8 2
8 2 How Long Is an Hour? The concept of passing time is difficult for young children. Hours, minutes, and seconds are confusing; children usually do not have a good sense of how long each time interval
Læs mereInfo og krav til grupper med motorkøjetøjer
Info og krav til grupper med motorkøjetøjer (English version, see page 4) GENERELT - FOR ALLE TYPER KØRETØJER ØJER GODT MILJØ FOR ALLE Vi ønsker at paraden er en god oplevelse for alle deltagere og tilskuere,
Læs mereBlack Jack --- Review. Spring 2012
Black Jack --- Review Spring 2012 Simulation Simulation can solve real-world problems by modeling realworld processes to provide otherwise unobtainable information. Computer simulation is used to predict
Læs mereBILAG 8.1.B TIL VEDTÆGTER FOR EXHIBIT 8.1.B TO THE ARTICLES OF ASSOCIATION FOR
BILAG 8.1.B TIL VEDTÆGTER FOR ZEALAND PHARMA A/S EXHIBIT 8.1.B TO THE ARTICLES OF ASSOCIATION FOR ZEALAND PHARMA A/S INDHOLDSFORTEGNELSE/TABLE OF CONTENTS 1 FORMÅL... 3 1 PURPOSE... 3 2 TILDELING AF WARRANTS...
Læs mereDesign til digitale kommunikationsplatforme-f2013
E-travellbook Design til digitale kommunikationsplatforme-f2013 ITU 22.05.2013 Dreamers Lana Grunwald - svetlana.grunwald@gmail.com Iya Murash-Millo - iyam@itu.dk Hiwa Mansurbeg - hiwm@itu.dk Jørgen K.
Læs mereVejledning til Sundhedsprocenten og Sundhedstjek
English version below Vejledning til Sundhedsprocenten og Sundhedstjek Udfyld Sundhedsprocenten Sæt mål og lav en handlingsplan Book tid til Sundhedstjek Log ind på www.falckhealthcare.dk/novo Har du problemer
Læs mereMandara. PebbleCreek. Tradition Series. 1,884 sq. ft robson.com. Exterior Design A. Exterior Design B.
Mandara 1,884 sq. ft. Tradition Series Exterior Design A Exterior Design B Exterior Design C Exterior Design D 623.935.6700 robson.com Tradition Series Exterior Design A w/opt. Golf Cart Garage Exterior
Læs mereDigitaliseringsstyrelsen
NemLog-in 29-05-2018 INTERNAL USE Indholdsfortegnelse 1 NEMLOG-IN-LØSNINGER GØRES SIKRERE... 3 1.1 TJENESTEUDBYDERE SKAL FORBEREDE DERES LØSNINGER... 3 1.2 HVIS LØSNINGEN IKKE FORBEREDES... 3 2 VEJLEDNING
Læs mereMSE PRESENTATION 2. Presented by Srunokshi.Kaniyur.Prema. Neelakantan Major Professor Dr. Torben Amtoft
CAPABILITY CONTROL LIST MSE PRESENTATION 2 Presented by Srunokshi.Kaniyur.Prema. Neelakantan Major Professor Dr. Torben Amtoft PRESENTATION OUTLINE Action items from phase 1 presentation tti Architecture
Læs mereKort A. Tidsbegrænset EF/EØS-opholdsbevis (anvendes til EF/EØS-statsborgere) (Card A. Temporary EU/EEA residence permit used for EU/EEA nationals)
DENMARK Residence cards EF/EØS opholdskort (EU/EEA residence card) (title on card) Kort A. Tidsbegrænset EF/EØS-opholdsbevis (anvendes til EF/EØS-statsborgere) (Card A. Temporary EU/EEA residence permit
Læs mereRemember the Ship, Additional Work
51 (104) Remember the Ship, Additional Work Remember the Ship Crosswords Across 3 A prejudiced person who is intolerant of any opinions differing from his own (5) 4 Another word for language (6) 6 The
Læs mereDomestic violence - violence against women by men
ICASS 22 26 august 2008 Nuuk Domestic violence - violence against women by men Mariekathrine Poppel Email: mkp@ii.uni.gl Ilisimatusarfik University of Greenland Violence: : a concern in the Arctic? Artic
Læs mereListen Mr Oxford Don, Additional Work
57 (104) Listen Mr Oxford Don, Additional Work Listen Mr Oxford Don Crosswords Across 1 Attack someone physically or emotionally (7) 6 Someone who helps another person commit a crime (9) 7 Rob at gunpoint
Læs mere