Technical Report Series on Corpus Building


Vol. 2 (March 2013)
Danish Corpora
Uwe Quasthoff, Dirk Goldhahn, Erla Hallsteinsdóttir
Abteilung Automatische Sprachverarbeitung, Institut für Informatik, Universität Leipzig

Affiliation of the authors:
Uwe Quasthoff and Dirk Goldhahn: Institut für Informatik, Universität Leipzig {quasthoff,
Erla Hallsteinsdóttir: Institut for Sprog og Kommunikation, Syddansk Universitet Odense,
Copyright: Abteilung Automatische Sprachverarbeitung, Institut für Informatik, Universität Leipzig,
Technical Report Series on Corpus Building
Vol. 1: Deutscher Wortschatz 2013
Vol. 2: Danish Corpora
This PDF document was created using the open source tool mwlib. For more information, see
PDF generated at: Tue, 15 May :19:38 UTC

Contents

Danish corpora
- Introduction to corpus creation
- DAN - a processing related language description
- DAN corpora
- DAN corpus comparison

Each appendix listed below is given once per corpus. Unless noted otherwise, it covers all ten corpora: dan news 2007, dan news 2008, dan news 2010, dan news 2011, dan newscrawl 2011, dan wikipedia 2007, dan wikipedia 2012, dan web 2002, dan web 2011 and dan mixed 2012.

Processing details
- Database summary

Content details
- Size of different TLDs (all corpora except dan wikipedia 2007 and dan wikipedia 2012)
- Size of largest domains (all corpora except dan wikipedia 2007 and dan wikipedia 2012)
- Number of sources by time period (dan news 2007, 2008, 2010 and 2011 only)

Word details
- Words by length without multiplicity
- Words by length with multiplicity
- The most frequent 50 words
- Longest words in top-1000 by rank

Character details
- Alphabet as used in the top words

Abbreviation details
- Most frequent abbreviations
- Left neighbors of the full stop
- Left neighbors of the full stop with additional internal full stops

Sentences details
- Shortest sentences
- Longest sentences
- Length of sentences in characters
- Length of sentences in words

Oddities details
- Longest words
- Sentences with high average word length (dan news 2007, 2008, 2010 and 2011, dan newscrawl 2011 and dan wikipedia 2012 only)
- Problems with sentence segmentation - words ending in a stopword

Danish corpora

Introduction to corpus creation

The Leipzig Corpora Collection (LCC) collects Web-based corpora for many different languages. The main text genres are newspaper texts, Wikipedias and randomly collected web pages. All corpora are processed in the same way:

- Crawling Web pages
- HTML stripping
- Language identification
- Sentence segmentation
- Cleaning: removal of ill-formed sentences
- Duplicate removal
- Calculation of word frequencies and word co-occurrences

As a result we have a corpus containing only well-formed sentences in the language under consideration. The sentences are in random order; hence, sharing the corpus does not violate copyright law because it is impossible to reconstruct the original texts.

The pre-processing contains both language-independent steps (like HTML stripping and duplicate removal) and language-dependent steps (like language identification and sentence segmentation). The language-specific parts in particular are vulnerable to processing problems. The aim of this paper is to identify possible problems and evaluate the results. The following problems are addressed:

- A processing-focused language description
- Language size: How much text is available for this language? What are the biggest sources?
- Corpus description: genre, size, crawling and processing date.
- Possible problems in language identification: Which languages are similar?
- Character set and alphabet
- Inspecting the word list: most frequent words, longer high-frequency words and the longest words overall. Word length distribution.
- Can abbreviations confuse sentence segmentation? Information about the abbreviation list.
- Inspecting sentences: shortest and longest sentences are inspected to identify possible segmentation problems. Sentence length distribution.

The paper describes the results of these inspections; the appendices show the exact results for the different corpora. This helps to compare the corpora with respect to quality. In the section Quality Overview, an overall quality description is given for each corpus. All corpora contain only minor problems which are irrelevant for most applications; where larger problems were found, corpus creation was repeated.
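The later processing steps listed above (cleaning, duplicate removal, random ordering, word frequency counting) can be sketched as follows. This is a minimal illustration, not the LCC toolchain: the well-formedness rule in `is_well_formed`, the function names and the whitespace-free `\w+` word definition are all assumptions made for the example.

```python
import random
import re
from collections import Counter

WORD_RE = re.compile(r"\w+")  # assumed word definition, not the LCC tokenizer


def is_well_formed(sentence, max_chars=256):
    """Hypothetical cleaning rule: capitalised start, final punctuation, bounded length."""
    s = sentence.strip()
    return 0 < len(s) <= max_chars and s[0].isupper() and s[-1] in ".!?"


def build_corpus(sentences, seed=0):
    """Clean, de-duplicate and shuffle sentences, then count word form frequencies."""
    seen = set()
    kept = []
    for sentence in sentences:
        s = sentence.strip()
        if is_well_formed(s) and s not in seen:  # exact-duplicate removal only
            seen.add(s)
            kept.append(s)
    # Random order: the original documents cannot be reassembled from the output.
    random.Random(seed).shuffle(kept)
    frequencies = Counter(w for s in kept for w in WORD_RE.findall(s))
    return kept, frequencies


raw = ["Det er en god dag.", "det er forkert", "Det er en god dag.", "Hvor er du?"]
corpus, freq = build_corpus(raw)
print(sorted(corpus), freq["er"])
```

The sketch removes exact duplicates only; near duplicates, which the report discusses later, would need an additional normalisation step.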

DAN - a processing related language description

General properties of the Danish language

- Native name: Dansk
- Classification: Indo-European, Germanic, North, East Scandinavian, Danish-Swedish, Danish-Riksmal, Danish
- Total number of speakers: 5.6M
- Largest countries with number of speakers: Denmark (5.6M)

Source: www.dst.dk/en/Statistik/emner/befolkning-og-befolkningsfremskrivning/folketal.aspx

Processing summary

- Latin alphabet with some additional characters
- The full stop is used as sentence boundary and for abbreviations
- Apostrophes are used rarely

Properties important for processing

Alphabet and punctuation

The alphabet is Latin based, with the following special features (sources: en.wikipedia.org/wiki/Alphabets_derived_from_the_Latin and en.wikipedia.org/wiki/Danish_and_Norwegian_alphabet):

- Danish includes all 26 base letters and Æ, Ø, Å.
- Additional letter forms: É (a diacritic used for disambiguation: en/et - én/ét).
- In foreign words: Á, À, Â, Ä, É, È, Ê, Ë, Í, Ì, Î, Ï, Ó, Ò, Ô, Ö, Ú, Ù, Û, Ü and more.
- Additional digraphs: EE in foreign words (trainee, frisbee); AA in older texts (replaced by å in 1948) and in names (Aalborg, Aarhus). Note: Aa is treated like Å in alphabetical sorting in Danish words only, meaning that Aabenraa is listed under Å (the last letter of the alphabet) and Aachen under A.
- Å, Æ and Ø might occur as AA, AE and OE in newer texts (avoidance of language-specific letters).
- Usual Latin punctuation.
- Usage of uppercase letters: at sentence beginnings and for proper names (of persons, organisations, countries etc.). When a word beginning with Aa is capitalized, only the first letter becomes a capital, e.g. Aarhus.

Sentence segmentation and word tokenization

Sentence beginnings: Sentences begin with a capitalized first word.

Abbreviations: Abbreviations can be confused with sentence boundaries, so a special abbreviation list has to be inspected. Sources for abbreviations: www.dsn.dk/retskrivning/retskrivningsregler/a / a / a7-42 and www.dsn.dk/sprogviden/udgivelser/sprognaevnets-skriftserie-1/flere-udgivelser/Rigtigt%20kort%20indskannet.pdf/at_download/file. Abbreviations with a full stop may appear in the word list without the full stop.

Apostrophes (www.dsn.dk/retskrivning/retskrivningsregler/a7-1-6/a7-6)

Use of apostrophes: infrequent.

- In elliptical forms like "bli'", "hva'", "ha'", "ka'" and "la'r" instead of "blive", "hvad", "have", "kan" and "lader". (Editorial note: please check why "ha'" is always followed by a ";"; this does not fit.)
- To mark the combination of a word or radical with inflectional endings:
  - in combination with the definite article: euro'en, PC'en, SMS'erne, OP'ens, CD-ROM'en
  - to mark the genitive (instead of "s") in words that end with the letters s, z or x: Marx's ven Wilhelm Liebknecht, Georg Brandes' Plads
  - to mark a genitive or plural form with "s": Jan's, foto's (both incorrect but frequent usages), and, in certain cases, other inflectional endings on proper names: Albert'er (2x Albert), Alberte'r (2x Alberte), Borges'ske dimensioner, Crohn's sygdom
  - in combination with English (or other foreign) words: chicken satay's, Google's brugsoplevelser
  - to mark the combination of numerals and inflectional endings: 60'er-rock
  - to mark the combination of foreign words ending in "-ee" and inflectional endings: frisbee'en, yankee'er
- Mainly used to mark citations

Sources and ranking (2012)

- Estimated number of webpages containing text:
  - Google.com top-5 words: results for "i" "og" "at" "er" "på"
  - Google.com top-10 words: results for "i" "og" "at" "er" "på" "til" "en" "af" "for" "med"
- Rank according to number of speakers (Ethnologue): 111
- Rank according to Wikipedia size (see de.wikipedia.org/wiki/Wikipedia:Sprachen): Rank 30 with articles.
- Rank according to number of newspapers as found by AbyZ (5/2012): 160 newspapers, rank 15.
- Rank according to number of newspapers with RSS feeds (5/2012): 110 newspapers, rank 14.
- Rank according to our corpus size (9/2012): 19
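The apostrophe conventions described in this section matter for word tokenization: forms such as euro'en, CD-ROM'en, 60'er-rock and bli' should stay single tokens. The following is a minimal sketch of a tokenizer that respects them. The regular expression and the name `tokenize` are assumptions for illustration and are not taken from the LCC toolchain.

```python
import re

# A token is a run of word characters, optionally continued by word-internal
# apostrophes or hyphens, optionally followed by one trailing apostrophe
# (elliptical forms such as "bli'" or genitives such as "Brandes'").
TOKEN_RE = re.compile(r"\w+(?:['’-]\w+)*['’]?")


def tokenize(text):
    """Split text into word tokens, keeping Danish apostrophe forms intact."""
    return TOKEN_RE.findall(text)


print(tokenize("Jeg ka' ikke finde CD-ROM'en."))
print(tokenize("60'er-rock og Georg Brandes' Plads"))
```

One known limitation of this sketch: a closing single quotation mark directly after a word is attached to that word, because it cannot be told apart from an elliptical apostrophe without further context.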

DAN corpora

Quality Overview

Quality Ratings

- A: Very good quality. Ready to use (or already used) for a frequency dictionary.
  - Size as large as possible
  - Only minimal errors
  - Multiple genres (if possible)
- A-: Small problems identified. They should not affect usage.
- B: Native speaker quality.
  - Information about abbreviations and sentence boundaries provided by a native speaker
  - Resulting statistics checked by a native speaker, possible errors corrected
- C: Non-native speaker quality.
  - Obvious problems shown in the corpus statistics are corrected
- D: First version.
  - Pre-processing with default abbreviation list and default sentence boundaries
- E: Poor quality: old, outdated or faulty.

Corpus Quality

The quality of the corpora differs slightly because the corpus processing toolchain changed slightly over several years. Moreover, the original data are often no longer available. Hence, improving quality often means removing incomplete or doubtful sentences. Forthcoming editions of all corpora thus might have a slightly smaller number of sentences. This especially applies to near duplicate sentences, which are removed only sparingly.

The following table shows the quality of the corpora. Minimal errors are still possible and are described in the sections below. All possible major improvements are mentioned here.

Corpus | Quality rating | Known problems | To-dos
dan_news_2007 | A- | near duplicates, see sentence length distribution | -
dan_news_2008 | A | - | -
dan_news_2010 | A | - | -
dan_news_2011 | A | - | -
dan_newscrawl_2011 | A | - | -
dan_wikipedia_2007 | A- | near duplicates, see sentence length distribution | -
dan_wikipedia_2012 | A | - | -
dan_web_2002 | A | - | -
dan_web_2011 | A | - | -
dan_mixed_2012 | A | - | -
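The "near duplicates" entries in the table refer to isolated peaks in the sentence length distribution. One way to look for such peaks automatically is sketched below. The function name `length_peaks` and the thresholds `factor` and `min_count` are hypothetical choices for this example; the report itself identifies the peaks by inspecting the plotted distribution.

```python
from collections import Counter


def length_peaks(sentences, factor=3.0, min_count=5):
    """Return sentence lengths (in characters) that stand out from their neighbours.

    A length counts as a peak when its frequency is at least `min_count` and
    exceeds `factor` times the mean frequency of the two adjacent lengths.
    """
    hist = Counter(len(s) for s in sentences)
    peaks = []
    for length, count in sorted(hist.items()):
        neighbours = (hist.get(length - 1, 0) + hist.get(length + 1, 0)) / 2
        if count >= min_count and count > factor * max(neighbours, 1.0):
            peaks.append(length)
    return peaks


# Toy data: two sentences per length from 40 to 50, plus 18 extra of length 48.
sample = ["x" * n for n in range(40, 51) for _ in range(2)] + ["y" * 48] * 18
print(length_peaks(sample))
```

Sentences at a flagged length can then be inspected by hand, since a peak only hints at near duplicates and does not prove them.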

Processing Overview

For more details, see Appendix: Database Summary and Appendix: Number of sources by time period.

Corpus | Size (M sentences) | Size (M running words) | Multiwords | Crawling date | Production date
dan_news_ /
dan_news_ /
dan_news_ /
dan_news_ daily
dan_newscrawl_ batch crawling 2012
dan_wikipedia_ dump
dan_wikipedia_ dump
dan_web_ randomly
dan_web_ randomly
dan_mixed_

Content Overview

For more details, see Appendix: Size of different TLDs and Appendix: Size of largest domains.

Corpus | Type of sources | Countries | Number of sources | Publishing date | Biggest source
dan_news_2007 | News | dk | 42 newspapers | |
dan_news_2008 | News | dk | 56 newspapers | |
dan_news_2010 | News | dk | 45 newspapers | |
dan_news_2011 | News | dk | 36 newspapers | |
dan_newscrawl_2011 | News | dk | 73 newspapers | 2011 and before |
dan_wikipedia_2007 | Wikipedia | | | |
dan_wikipedia_2012 | Wikipedia | | | |
dan_web_2002 | Web | dk | domains | 2002 and before |
dan_web_2011 | Web | dk | domains | 2011 and before | aarhus.lokalavisen.dk/
dan_mixed_2012 | combined | combined | domains | 2011 and before |

Words

Appendix: Words by Length without multiplicity and Appendix: Words by Length with multiplicity show the length distribution for words. The curves should be smooth and decreasing for length>=5.

Appendix: The Most Frequent 50 Words shows the most frequent stopwords as well as one or more words related to the region.

Appendix: Longest Words in Top-1000 by rank shows the 25 longest words within the top 1000. They usually give an impression of the main topics treated in the corpus.

Appendix: Longest Words with minimum frequency 2 should give an idea of very long words. In the case of processing problems, different types of non-words may appear. This might help to improve the word definition.
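The difference between the two word length distributions can be made concrete with a short sketch: "without multiplicity" counts every distinct word form once, "with multiplicity" weights each word form by its corpus frequency. The function names and the toy frequency list are assumptions for illustration only.

```python
from collections import Counter


def word_length_distributions(word_freq):
    """Return (without_multiplicity, with_multiplicity) length histograms.

    `word_freq` maps each distinct word form to its corpus frequency.
    """
    without_mult = Counter()
    with_mult = Counter()
    for word, freq in word_freq.items():
        without_mult[len(word)] += 1   # each distinct word form counts once
        with_mult[len(word)] += freq   # each occurrence counts
    return without_mult, with_mult


def average_length(dist):
    """Mean word length of a length histogram."""
    total = sum(dist.values())
    return sum(length * n for length, n in dist.items()) / total


toy = {"og": 10, "er": 5, "avis": 2, "regering": 1}
types, tokens = word_length_distributions(toy)
print(average_length(types), average_length(tokens))
```

Because short stopwords are very frequent, the average with multiplicity is typically well below the average without multiplicity, as in the toy data above.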

Corpus | Word length graph without multiplicity | Word length graph with multiplicity | Most Frequent 50 Words | Longest Words in Top-1000 | Longest Words with minimum frequency 2
dan_news_2007 | okay | okay, min. avg | okay | okay | URLs, routes
dan_news_2008 | okay | okay | okay | okay | okay
dan_news_2010 | okay | okay | okay | okay | URLs, routes
dan_news_2011 | okay | okay | okay | okay | missing blanks, hex strings
dan_newscrawl_2011 | okay | okay | Publiceret and.. | okay | URLs, routes, missing blanks
dan_wikipedia_2007 | okay, min. avg | okay, max. avg | okay | okay | URLs, routes
dan_wikipedia_2012 | okay | okay | okay | okay | URLs
dan_web_2002 | okay, max. avg | okay | okay | okay | missing blanks, special characters
dan_web_2011 | okay | okay | okay | okay | missing blanks, special characters
dan_mixed_2012 | okay | okay | okay | okay | all errors as above

Remarks

- The average word length (without multiplicity) differs for the different text genres.
- There is an unexpected minimum in the length distribution (with multiplicity) for length 4.

Abbreviations

For sentence boundary detection, abbreviations ending in a full stop are of interest: the full stop of such an abbreviation usually does not mark a sentence boundary. Conversely, abbreviations missing from the list lead to spurious sentence boundaries. The list of abbreviations is of high quality: nearly complete and manually checked. Due to limitations in the processing chain, this list of abbreviations is only used for sentence boundary detection and is not included in the word list. Hence, abbreviations ending with a full stop appear in the word list without the full stop.

Sentences

Appendix: Shortest sentences shows the shortest declarative, exclamatory and interrogative sentences. In pre-processing, a minimal length for sentences might be specified. Missing abbreviations are often visible as faulty sentence endings.

Appendix: Longest sentences shows the longest declarative, exclamatory and interrogative sentences. Usually, the maximum sentence length is defined as 256 characters (not 256 bytes). Very long exclamatory or interrogative sentences often contain an overlooked sentence boundary.

Appendix: Length of sentences in characters shows the distribution of the sentence length. A large and balanced corpus will result in a smooth and bell-shaped curve. Isolated local maxima usually result from large sets of near duplicate sentences.
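The interaction between the abbreviation list and sentence boundary detection described above can be illustrated with a small sketch. It is an assumption-laden simplification, not the segmenter used for the corpora: the abbreviation set is a tiny illustrative sample of common Danish abbreviations, and the boundary rule (punctuation followed by whitespace and a capital letter, or by the end of the text) is a hypothetical stand-in.

```python
import re

# Tiny illustrative sample; the report's manually checked list is much larger.
ABBREVIATIONS = {"bl.a.", "f.eks.", "ca.", "kl.", "dvs."}

# Candidate boundary: ., ! or ? followed by whitespace + capital letter, or end of text.
BOUNDARY_RE = re.compile(r"[.!?](?=\s+[A-ZÆØÅ]|\s*$)")


def split_sentences(text, abbreviations=ABBREVIATIONS, max_chars=256):
    """Split text at candidate boundaries unless the last token is a known abbreviation."""
    sentences, start = [], 0
    for match in BOUNDARY_RE.finditer(text):
        end = match.end()
        last_token = text[start:end].split()[-1]
        if last_token.lower() in abbreviations:
            continue  # the full stop belongs to the abbreviation
        sentence = text[start:end].strip()
        if len(sentence) <= max_chars:  # len() counts characters, not bytes
            sentences.append(sentence)
        start = end
    return sentences


text = "Vi har bl.a. Peter med. Kommer du?"
print(split_sentences(text))
print(split_sentences(text, abbreviations=set()))
```

The second call, with an empty abbreviation list, shows the over-generation of sentence boundaries: "Vi har bl.a." becomes a faulty sentence of its own.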

Corpus | Shortest sentences | Longest sentences | Length distribution (in characters) | Length distribution (in words)
dan_news_2007 | unsymmetric quotation marks | okay | near duplicate peak at 48 | okay
dan_news_2008 | some unsymmetric quotation marks | okay | sentences longer than 255? | okay
dan_news_2010 | okay | 1 menu list, 2x hex data | near duplicate peak at 42? | okay
dan_news_2011 | duplicate sentences | declarative sentences with many time data | near duplicate peak at 42 | okay
dan_newscrawl_2011 | okay | declarative sentences with many time data | many near duplicate peaks | many near duplicate peaks
dan_wikipedia_2007 | declarative sentences beginning with digits and ending with abbrev. | okay | near duplicate peak at 20 | sharp maximum at 10
dan_wikipedia_2012 | okay | okay | okay | okay
dan_web_2002 dan_web_2011 dan_mixed_2012 | declarative non-sentences, interrogative sentences beginning lowercase or with blank | Lowercase beginnings for declarative sentences | okay | very smooth | okay | Enumerations, multiple sentences | max. 277 characters

Oddities

Appendix: Sentences with high average word length: Average sentences contain many stopwords, and these stopwords are usually short. Hence, they limit the average word length in a sentence. Conversely, sentences with a high average word length are often ill-formed. They may be used to improve pre-processing.

Appendix: Problems with sentence segmentation - Words ending in a stopword: If there are many ill-formed word or sentence boundaries without a blank between two words, they will generate new ill-formed words. The appendix shows the most frequent words ending in an uppercase stopword. If they are infrequent, the data are of high quality.

Corpus | Sentences with high average word length | Words ending in a stopword...
dan_news_2007 | all kinds of errors | okay
dan_news_2008 | okay | okay
dan_news_2010 | 2x hex strings | maxfreq=11
dan_news_2011 | 2x hex strings, 2x missing blanks | maxfreq=24
dan_newscrawl_2011 | 1x missing blanks | maxfreq=805
dan_wikipedia_2007 | (no data) | okay
dan_wikipedia_2012 | okay | maxfreq=27
dan_web_2002 | missing blanks, underscores | maxfreq=67
dan_web_2011 | missing blanks | maxfreq=58
dan_mixed_2012 | missing blanks, underscores | words containing ";"
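Both oddity checks can be sketched in a few lines. The stopword set below reuses the ten frequent Danish words quoted earlier in the report ("i", "og", "at", "er", "på", "til", "en", "af", "for", "med"); reading "uppercase stopword" as a stopword with a capitalized first letter glued to a preceding lowercase letter is an assumption, as are the function names and the toy words.

```python
STOPWORDS = {"i", "og", "at", "er", "på", "til", "en", "af", "for", "med"}


def average_word_length(sentence):
    """Mean length of the whitespace-separated words in a sentence."""
    words = sentence.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def glued_stopword_words(word_freq, stopwords=STOPWORDS):
    """Find word forms such as 'avisenOg' that end in a capitalized stopword.

    Returns (word, frequency) pairs sorted by descending frequency; the first
    frequency corresponds to a 'maxfreq' value as used in the table above.
    """
    hits = {}
    for word, freq in word_freq.items():
        for stop in stopwords:
            suffix = stop.capitalize()
            if (
                len(word) > len(suffix)
                and word.endswith(suffix)
                and word[-len(suffix) - 1].islower()  # missing blank before the stopword
            ):
                hits[word] = freq
                break
    return sorted(hits.items(), key=lambda kv: -kv[1])


toy = {"avisenOg": 3, "dagenDet": 1, "og": 100, "Island": 5, "regeringenPå": 7}
print(glued_stopword_words(toy))
print(average_word_length("Det er en test"))
```

Sentences can be ranked by `average_word_length` in the same way, and the top of that ranking inspected for URLs, hex strings or missing blanks.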

DAN corpus comparison

Automated Corpus comparison

For the comparisons, the following tests on the top-1000 words are performed: Vectors based on the frequencies of the top-1000 words are created for the analysed languages. As distance value, 1-cos(alpha) of the angle alpha between these vectors is computed. Identical frequency vectors receive a value of 0; vectors with nothing in common receive a value of 1. The same analysis is conducted using the frequencies of the top-1000 typical letter trigrams of the languages.

Monolingual word list comparison (top-1000 words)

As one can expect, the comparisons show:

- The different news corpora have word lists with maximum distance 0.19 (dan_newscrawl_2011 and dan_news_2008).
- The web corpora have word lists with distance 0.13.
- The wikipedia corpora are similar, with distance 0.10.
- The biggest distance of 0.36 can be found between dan_wikipedia_2007 and dan_news_2008.
- The mixed corpus dan_mixed_2012 has a central position within the corpora and has a maximum distance of 0.31 to the wikipedia_2007 corpus.

Multilingual word list comparison (top-1000 words)

Both the comparison of the top-1000 words and the comparison of the letter trigrams used in these words were conducted to find the most similar languages based on these features. Considering words, the nearest language to Danish is Swedish, with distance 0.47. Considering letter trigrams, the nearest language is Bokmål, with distance 0.38. These distances are below average. On average the value for the most similar language to a language in question is 0.58 for trigrams.

The most similar languages based on words: Swedish, Bokmål, Nynorsk

source | language_short_name | language_name | cos_logfreq
dan | swe | Swedish |
dan | nob | Norwegian, Bokmål |
dan | nno | Norwegian, Nynorsk |
dan | fao | Faroese |
dan | isl | Icelandic |

The most similar languages based on letter trigrams: Bokmål, Swedish, Dutch

source | language_short_name | language_name | cos_logfreq
dan | nob | Norwegian, Bokmål |
dan | swe | Swedish |
dan | nld | Dutch |
dan | nno | Norwegian, Nynorsk |
dan | deu | German |
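The distance 1-cos(alpha) used in this comparison can be sketched as follows. The function name and the toy word lists are illustrative. The option `use_log` is an assumption prompted by the column name cos_logfreq, which suggests that logarithmic frequencies were used; the report does not spell out the exact weighting.

```python
import math


def cosine_distance(freq_a, freq_b, top_n=1000, use_log=True):
    """Return 1 - cos(alpha) between the top-n frequency vectors of two lists.

    `freq_a` and `freq_b` map items (words or letter trigrams) to frequencies.
    """
    top_a = dict(sorted(freq_a.items(), key=lambda kv: -kv[1])[:top_n])
    top_b = dict(sorted(freq_b.items(), key=lambda kv: -kv[1])[:top_n])

    def weight(f):
        return math.log(1 + f) if use_log else float(f)

    keys = set(top_a) | set(top_b)
    vec_a = [weight(top_a.get(k, 0)) for k in keys]
    vec_b = [weight(top_b.get(k, 0)) for k in keys]
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm = math.sqrt(sum(a * a for a in vec_a)) * math.sqrt(sum(b * b for b in vec_b))
    return 1.0 - dot / norm if norm else 1.0


danish = {"og": 120, "i": 100, "at": 90, "er": 80}
swedish = {"och": 110, "i": 105, "att": 85, "är": 70}
print(cosine_distance(danish, danish))
print(cosine_distance(danish, swedish))
```

The same function applies unchanged to letter trigram frequencies, since only the keys of the dictionaries differ.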

Processing details

Appendix to dan news 2007: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords | 0
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan news 2008: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan news 2010: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords | 9696
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan news 2011: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan newscrawl 2011: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to dan wikipedia 2007: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |
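How the parameters of a database summary relate to the underlying sentences can be sketched on a toy corpus. Several points are assumptions made for this example: words are split on whitespace, co-occurrences are counted as distinct word pairs without any significance filtering, the frequency threshold for the percentage row is exposed as the hypothetical parameter `rare_freq` (its value is not legible in this transcript), and multiwords are left out.

```python
from collections import Counter
from itertools import combinations


def database_summary(sentences, rare_freq=1):
    """Compute summary parameters for a list of sentences (simplified)."""
    tokenized = [s.split() for s in sentences]
    word_freq = Counter(w for tokens in tokenized for w in tokens)

    sentence_cooc = Counter()   # unordered word pairs within one sentence
    neighbour_cooc = Counter()  # ordered pairs of directly adjacent words
    for tokens in tokenized:
        for a, b in combinations(sorted(set(tokens)), 2):
            sentence_cooc[(a, b)] += 1
        for left, right in zip(tokens, tokens[1:]):
            neighbour_cooc[(left, right)] += 1

    rare = sum(1 for f in word_freq.values() if f == rare_freq)
    return {
        "sentences": len(sentences),
        "running_word_forms": sum(word_freq.values()),
        "distinct_word_forms": len(word_freq),
        "pct_words_with_freq": 100.0 * rare / len(word_freq) if word_freq else 0.0,
        "sentence_cooccurrences": len(sentence_cooc),
        "neighbour_cooccurrences": len(neighbour_cooc),
    }


print(database_summary(["a b c", "a b"]))
```

For real corpora the pairwise loop over each sentence grows quadratically with sentence length, which is one practical reason for bounding the maximum sentence length.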


Læs mere

User Manual for LTC IGNOU

User Manual for LTC IGNOU User Manual for LTC IGNOU 1 LTC (Leave Travel Concession) Navigation: Portal Launch HCM Application Self Service LTC Self Service 1. LTC Advance/Intimation Navigation: Launch HCM Application Self Service

Læs mere

Portal Registration. Check Junk Mail for activation . 1 Click the hyperlink to take you back to the portal to confirm your registration

Portal Registration. Check Junk Mail for activation  . 1 Click the hyperlink to take you back to the portal to confirm your registration Portal Registration Step 1 Provide the necessary information to create your user. Note: First Name, Last Name and Email have to match exactly to your profile in the Membership system. Step 2 Click on the

Læs mere

Engelsk. Niveau C. De Merkantile Erhvervsuddannelser September 2005. Casebaseret eksamen. www.jysk.dk og www.jysk.com.

Engelsk. Niveau C. De Merkantile Erhvervsuddannelser September 2005. Casebaseret eksamen. www.jysk.dk og www.jysk.com. 052430_EngelskC 08/09/05 13:29 Side 1 De Merkantile Erhvervsuddannelser September 2005 Side 1 af 4 sider Casebaseret eksamen Engelsk Niveau C www.jysk.dk og www.jysk.com Indhold: Opgave 1 Presentation

Læs mere

Vina Nguyen HSSP July 13, 2008

Vina Nguyen HSSP July 13, 2008 Vina Nguyen HSSP July 13, 2008 1 What does it mean if sets A, B, C are a partition of set D? 2 How do you calculate P(A B) using the formula for conditional probability? 3 What is the difference between

Læs mere

Aktivering af Survey funktionalitet

Aktivering af Survey funktionalitet Surveys i REDCap REDCap gør det muligt at eksponere ét eller flere instrumenter som et survey (spørgeskema) som derefter kan udfyldes direkte af patienten eller forsøgspersonen over internettet. Dette

Læs mere

On the complexity of drawing trees nicely: corrigendum

On the complexity of drawing trees nicely: corrigendum Acta Informatica 40, 603 607 (2004) Digital Object Identifier (DOI) 10.1007/s00236-004-0138-y On the complexity of drawing trees nicely: corrigendum Thorsten Akkerman, Christoph Buchheim, Michael Jünger,

Læs mere

PARALLELIZATION OF ATTILA SIMULATOR WITH OPENMP MIGUEL ÁNGEL MARTÍNEZ DEL AMOR MINIPROJECT OF TDT24 NTNU

PARALLELIZATION OF ATTILA SIMULATOR WITH OPENMP MIGUEL ÁNGEL MARTÍNEZ DEL AMOR MINIPROJECT OF TDT24 NTNU PARALLELIZATION OF ATTILA SIMULATOR WITH OPENMP MIGUEL ÁNGEL MARTÍNEZ DEL AMOR MINIPROJECT OF TDT24 NTNU OUTLINE INEFFICIENCY OF ATTILA WAYS TO PARALLELIZE LOW COMPATIBILITY IN THE COMPILATION A SOLUTION

Læs mere

Website review groweasy.dk

Website review groweasy.dk Website review groweasy.dk Generated on September 01 2016 10:32 AM The score is 56/100 SEO Content Title Webbureau Odense GrowEasy hjælper dig med digital markedsføring! Length : 66 Perfect, your title

Læs mere

To the reader: Information regarding this document

To the reader: Information regarding this document To the reader: Information regarding this document All text to be shown to respondents in this study is going to be in Danish. The Danish version of the text (the one, respondents are going to see) appears

Læs mere

Statistical information form the Danish EPC database - use for the building stock model in Denmark

Statistical information form the Danish EPC database - use for the building stock model in Denmark Statistical information form the Danish EPC database - use for the building stock model in Denmark Kim B. Wittchen Danish Building Research Institute, SBi AALBORG UNIVERSITY Certification of buildings

Læs mere

Nyhedsmail, december 2013 (scroll down for English version)

Nyhedsmail, december 2013 (scroll down for English version) Nyhedsmail, december 2013 (scroll down for English version) Kære Omdeler Julen venter rundt om hjørnet. Og netop julen er årsagen til, at NORDJYSKE Distributions mange omdelere har ekstra travlt med at

Læs mere

Learnings from the implementation of Epic

Learnings from the implementation of Epic Learnings from the implementation of Epic Appendix Picture from Region H (2016) A thesis report by: Oliver Metcalf-Rinaldo, oliv@itu.dk Stephan Mosko Jensen, smos@itu.dk Appendix - Table of content Appendix

Læs mere

Bilag. Resume. Side 1 af 12

Bilag. Resume. Side 1 af 12 Bilag Resume I denne opgave, lægges der fokus på unge og ensomhed gennem sociale medier. Vi har i denne opgave valgt at benytte Facebook som det sociale medie vi ligger fokus på, da det er det største

Læs mere

GUIDE TIL BREVSKRIVNING

GUIDE TIL BREVSKRIVNING GUIDE TIL BREVSKRIVNING APPELBREVE Formålet med at skrive et appelbrev er at få modtageren til at overholde menneskerettighederne. Det er en god idé at lægge vægt på modtagerens forpligtelser over for

Læs mere

Basic statistics for experimental medical researchers

Basic statistics for experimental medical researchers Basic statistics for experimental medical researchers Sample size calculations September 15th 2016 Christian Pipper Department of public health (IFSV) Faculty of Health and Medicinal Science (SUND) E-mail:

Læs mere

Hvor er mine runde hjørner?

Hvor er mine runde hjørner? Hvor er mine runde hjørner? Ofte møder vi fortvivlelse blandt kunder, når de ser deres nye flotte site i deres browser og indser, at det ser anderledes ud, i forhold til det design, de godkendte i starten

Læs mere

Trolling Master Bornholm 2014

Trolling Master Bornholm 2014 Trolling Master Bornholm 2014 (English version further down) Den ny havn i Tejn Havn Bornholms Regionskommune er gået i gang med at udvide Tejn Havn, og det er med til at gøre det muligt, at vi kan være

Læs mere

Trolling Master Bornholm 2012

Trolling Master Bornholm 2012 Trolling Master Bornholm 1 (English version further down) Tak for denne gang Det var en fornøjelse især jo også fordi vejret var med os. Så heldig har vi aldrig været før. Vi skal evaluere 1, og I må meget

Læs mere

Sport for the elderly

Sport for the elderly Sport for the elderly - Teenagers of the future Play the Game 2013 Aarhus, 29 October 2013 Ditte Toft Danish Institute for Sports Studies +45 3266 1037 ditte.toft@idan.dk A growing group in the population

Læs mere

DoodleBUGS (Hands-on)

DoodleBUGS (Hands-on) DoodleBUGS (Hands-on) Simple example: Program: bino_ave_sim_doodle.odc A simulation example Generate a sample from F=(r1+r2)/2 where r1~bin(0.5,200) and r2~bin(0.25,100) Note that E(F)=(100+25)/2=62.5

Læs mere

Skriftlig Eksamen Beregnelighed (DM517)

Skriftlig Eksamen Beregnelighed (DM517) Skriftlig Eksamen Beregnelighed (DM517) Institut for Matematik & Datalogi Syddansk Universitet Mandag den 7 Januar 2008, kl. 9 13 Alle sædvanlige hjælpemidler (lærebøger, notater etc.) samt brug af lommeregner

Læs mere

SAS Corporate Program Website

SAS Corporate Program Website SAS Corporate Program Website Dear user We have developed SAS Corporate Program Website to make the administration of your company's travel activities easier. You can read about it in this booklet, which

Læs mere

Project Step 7. Behavioral modeling of a dual ported register set. 1/8/ L11 Project Step 5 Copyright Joanne DeGroat, ECE, OSU 1

Project Step 7. Behavioral modeling of a dual ported register set. 1/8/ L11 Project Step 5 Copyright Joanne DeGroat, ECE, OSU 1 Project Step 7 Behavioral modeling of a dual ported register set. Copyright 2006 - Joanne DeGroat, ECE, OSU 1 The register set Register set specifications 16 dual ported registers each with 16- bit words

Læs mere

Skriftlig Eksamen Kombinatorik, Sandsynlighed og Randomiserede Algoritmer (DM528)

Skriftlig Eksamen Kombinatorik, Sandsynlighed og Randomiserede Algoritmer (DM528) Skriftlig Eksamen Kombinatorik, Sandsynlighed og Randomiserede Algoritmer (DM58) Institut for Matematik og Datalogi Syddansk Universitet, Odense Torsdag den 1. januar 01 kl. 9 13 Alle sædvanlige hjælpemidler

Læs mere

Special VFR. - ved flyvning til mindre flyveplads uden tårnkontrol som ligger indenfor en kontrolzone

Special VFR. - ved flyvning til mindre flyveplads uden tårnkontrol som ligger indenfor en kontrolzone Special VFR - ved flyvning til mindre flyveplads uden tårnkontrol som ligger indenfor en kontrolzone SERA.5005 Visual flight rules (a) Except when operating as a special VFR flight, VFR flights shall be

Læs mere

Trolling Master Bornholm 2016 Nyhedsbrev nr. 8

Trolling Master Bornholm 2016 Nyhedsbrev nr. 8 Trolling Master Bornholm 2016 Nyhedsbrev nr. 8 English version further down Der bliver landet fisk men ikke mange Her er det Johnny Nielsen, Søløven, fra Tejn, som i denne uge fangede 13,0 kg nord for

Læs mere

Besvarelser til Lineær Algebra Reeksamen Februar 2017

Besvarelser til Lineær Algebra Reeksamen Februar 2017 Besvarelser til Lineær Algebra Reeksamen - 7. Februar 207 Mikkel Findinge Bemærk, at der kan være sneget sig fejl ind. Kontakt mig endelig, hvis du skulle falde over en sådan. Dette dokument har udelukkende

Læs mere

ECE 551: Digital System * Design & Synthesis Lecture Set 5

ECE 551: Digital System * Design & Synthesis Lecture Set 5 ECE 551: Digital System * Design & Synthesis Lecture Set 5 5.1: Verilog Behavioral Model for Finite State Machines (FSMs) 5.2: Verilog Simulation I/O and 2001 Standard (In Separate File) 3/4/2003 1 ECE

Læs mere

Trolling Master Bornholm 2016 Nyhedsbrev nr. 3

Trolling Master Bornholm 2016 Nyhedsbrev nr. 3 Trolling Master Bornholm 2016 Nyhedsbrev nr. 3 English version further down Den første dag i Bornholmerlaks konkurrencen Formanden for Bornholms Trollingklub, Anders Schou Jensen (og meddomer i TMB) fik

Læs mere

Trolling Master Bornholm 2016 Nyhedsbrev nr. 7

Trolling Master Bornholm 2016 Nyhedsbrev nr. 7 Trolling Master Bornholm 2016 Nyhedsbrev nr. 7 English version further down Så var det omsider fiskevejr En af dem, der kom på vandet i en af hullerne, mellem den hårde vestenvind var Lejf K. Pedersen,

Læs mere

Trolling Master Bornholm 2014

Trolling Master Bornholm 2014 Trolling Master Bornholm 2014 (English version further down) Ny præmie Trolling Master Bornholm fylder 10 år næste gang. Det betyder, at vi har fundet på en ny og ganske anderledes præmie. Den fisker,

Læs mere

Bookingmuligheder for professionelle brugere i Dansehallerne 2015-16

Bookingmuligheder for professionelle brugere i Dansehallerne 2015-16 Bookingmuligheder for professionelle brugere i Dansehallerne 2015-16 Modtager man økonomisk støtte til et danseprojekt, har en premieredato og er professionel bruger af Dansehallerne har man mulighed for

Læs mere

Trolling Master Bornholm 2013

Trolling Master Bornholm 2013 Trolling Master Bornholm 2013 (English version further down) Tilmeldingen åbner om to uger Mandag den 3. december kl. 8.00 åbner tilmeldingen til Trolling Master Bornholm 2013. Vi har flere tilmeldinger

Læs mere

Strings and Sets: set complement, union, intersection, etc. set concatenation AB, power of set A n, A, A +

Strings and Sets: set complement, union, intersection, etc. set concatenation AB, power of set A n, A, A + Strings and Sets: A string over Σ is any nite-length sequence of elements of Σ The set of all strings over alphabet Σ is denoted as Σ Operators over set: set complement, union, intersection, etc. set concatenation

Læs mere

ESG reporting meeting investors needs

ESG reporting meeting investors needs ESG reporting meeting investors needs Carina Ohm Nordic Head of Climate Change and Sustainability Services, EY DIRF dagen, 24 September 2019 Investors have growing focus on ESG EY Investor Survey 2018

Læs mere

Danish Language Course for International University Students Copenhagen, 12 July 1 August Application form

Danish Language Course for International University Students Copenhagen, 12 July 1 August Application form Danish Language Course for International University Students Copenhagen, 12 July 1 August 2017 Application form Must be completed on the computer in Danish or English All fields are mandatory PERSONLIGE

Læs mere

Applications. Computational Linguistics: Jordan Boyd-Graber University of Maryland RL FOR MACHINE TRANSLATION. Slides adapted from Phillip Koehn

Applications. Computational Linguistics: Jordan Boyd-Graber University of Maryland RL FOR MACHINE TRANSLATION. Slides adapted from Phillip Koehn Applications Slides adapted from Phillip Koehn Computational Linguistics: Jordan Boyd-Graber University of Maryland RL FOR MACHINE TRANSLATION Computational Linguistics: Jordan Boyd-Graber UMD Applications

Læs mere

Dendrokronologisk Laboratorium

Dendrokronologisk Laboratorium Dendrokronologisk Laboratorium NNU rapport 14, 2001 ROAGER KIRKE, TØNDER AMT Nationalmuseet og Den Antikvariske Samling i Ribe. Undersøgt af Orla Hylleberg Eriksen. NNU j.nr. A5712 Foto: P. Kristiansen,

Læs mere

Heuristics for Improving

Heuristics for Improving Heuristics for Improving Model Learning Based Testing Muhammad Naeem Irfan VASCO-LIG LIG, Computer Science Lab, Grenoble Universities, 38402 Saint Martin d Hères France Introduction Component Based Software

Læs mere

Danish Language Course for Foreign University Students Copenhagen, 13 July 2 August 2016 Advanced, medium and beginner s level.

Danish Language Course for Foreign University Students Copenhagen, 13 July 2 August 2016 Advanced, medium and beginner s level. Danish Language Course for Foreign University Students Copenhagen, 13 July 2 August 2016 Advanced, medium and beginner s level Application form Must be completed on the computer in Danish or English All

Læs mere

applies equally to HRT and tibolone this should be made clear by replacing HRT with HRT or tibolone in the tibolone SmPC.

applies equally to HRT and tibolone this should be made clear by replacing HRT with HRT or tibolone in the tibolone SmPC. Annex I English wording to be implemented SmPC The texts of the 3 rd revision of the Core SPC for HRT products, as published on the CMD(h) website, should be included in the SmPC. Where a statement in

Læs mere

IBM Network Station Manager. esuite 1.5 / NSM Integration. IBM Network Computer Division. tdc - 02/08/99 lotusnsm.prz Page 1

IBM Network Station Manager. esuite 1.5 / NSM Integration. IBM Network Computer Division. tdc - 02/08/99 lotusnsm.prz Page 1 IBM Network Station Manager esuite 1.5 / NSM Integration IBM Network Computer Division tdc - 02/08/99 lotusnsm.prz Page 1 New esuite Settings in NSM The Lotus esuite Workplace administration option is

Læs mere

Velkommen til IFF QA erfa møde d. 15. marts Erfaringer med miljømonitorering og tolkning af nyt anneks 1.

Velkommen til IFF QA erfa møde d. 15. marts Erfaringer med miljømonitorering og tolkning af nyt anneks 1. Velkommen til IFF QA erfa møde d. 15. marts 2018 Erfaringer med miljømonitorering og tolkning af nyt anneks 1. 1 Fast agenda kl.16.30-18.00 1. Nyt fra kurser, seminarer, myndighedsinspektioner, audit som

Læs mere

Central Statistical Agency.

Central Statistical Agency. Central Statistical Agency www.csa.gov.et 1 Outline Introduction Characteristics of Construction Aim of the Survey Methodology Result Conclusion 2 Introduction Meaning of Construction Construction may

Læs mere

Subject to terms and conditions. WEEK Type Price EUR WEEK Type Price EUR WEEK Type Price EUR WEEK Type Price EUR

Subject to terms and conditions. WEEK Type Price EUR WEEK Type Price EUR WEEK Type Price EUR WEEK Type Price EUR ITSO SERVICE OFFICE Weeks for Sale 31/05/2015 m: +34 636 277 307 w: clublasanta-timeshare.com e: roger@clublasanta.com See colour key sheet news: rogercls.blogspot.com Subject to terms and conditions THURSDAY

Læs mere

DK - Quick Text Translation. HEYYER Net Promoter System Magento extension

DK - Quick Text Translation. HEYYER Net Promoter System Magento extension DK - Quick Text Translation HEYYER Net Promoter System Magento extension Version 1.0 15-11-2013 HEYYER / Email Templates Invitation Email Template Invitation Email English Dansk Title Invitation Email

Læs mere

ATEX direktivet. Vedligeholdelse af ATEX certifikater mv. Steen Christensen stec@teknologisk.dk www.atexdirektivet.

ATEX direktivet. Vedligeholdelse af ATEX certifikater mv. Steen Christensen stec@teknologisk.dk www.atexdirektivet. ATEX direktivet Vedligeholdelse af ATEX certifikater mv. Steen Christensen stec@teknologisk.dk www.atexdirektivet.dk tlf: 7220 2693 Vedligeholdelse af Certifikater / tekniske dossier / overensstemmelseserklæringen.

Læs mere

Linear Programming ١ C H A P T E R 2

Linear Programming ١ C H A P T E R 2 Linear Programming ١ C H A P T E R 2 Problem Formulation Problem formulation or modeling is the process of translating a verbal statement of a problem into a mathematical statement. The Guidelines of formulation

Læs mere

Trolling Master Bornholm 2015

Trolling Master Bornholm 2015 Trolling Master Bornholm 2015 (English version further down) Sæsonen er ved at komme i omdrejninger. Her er det John Eriksen fra Nexø med 95 cm og en kontrolleret vægt på 11,8 kg fanget på østkysten af

Læs mere

Mandara. PebbleCreek. Tradition Series. 1,884 sq. ft robson.com. Exterior Design A. Exterior Design B.

Mandara. PebbleCreek. Tradition Series. 1,884 sq. ft robson.com. Exterior Design A. Exterior Design B. Mandara 1,884 sq. ft. Tradition Series Exterior Design A Exterior Design B Exterior Design C Exterior Design D 623.935.6700 robson.com Tradition OPTIONS Series Exterior Design A w/opt. Golf Cart Garage

Læs mere

Statistik for MPH: 7

Statistik for MPH: 7 Statistik for MPH: 7 3. november 2011 www.biostat.ku.dk/~pka/mph11 Attributable risk, bestemmelse af stikprøvestørrelse (Silva: 333-365, 381-383) Per Kragh Andersen 1 Fra den 6. uges statistikundervisning:

Læs mere

Sikkerhed & Revision 2013

Sikkerhed & Revision 2013 Sikkerhed & Revision 2013 Samarbejde mellem intern revisor og ekstern revisor - og ISA 610 v/ Dorthe Tolborg Regional Chief Auditor, Codan Group og formand for IIA DK RSA REPRESENTATION WORLD WIDE 300

Læs mere

Improving data services by creating a question database. Nanna Floor Clausen Danish Data Archives

Improving data services by creating a question database. Nanna Floor Clausen Danish Data Archives Improving data services by creating a question database Nanna Floor Clausen Danish Data Archives Background Pressure on the students Decrease in response rates The users want more Why a question database?

Læs mere

The River Underground, Additional Work

The River Underground, Additional Work 39 (104) The River Underground, Additional Work The River Underground Crosswords Across 1 Another word for "hard to cope with", "unendurable", "insufferable" (10) 5 Another word for "think", "believe",

Læs mere

Den nye Eurocode EC Geotenikerdagen Morten S. Rasmussen

Den nye Eurocode EC Geotenikerdagen Morten S. Rasmussen Den nye Eurocode EC1997-1 Geotenikerdagen Morten S. Rasmussen UDFORDRINGER VED EC 1997-1 HVAD SKAL VI RUNDE - OPBYGNINGEN AF DE NYE EUROCODES - DE STØRSTE UDFORDRINGER - ER DER NOGET POSITIVT? 2 OPBYGNING

Læs mere

Barnets navn: Børnehave: Kommune: Barnets modersmål (kan være mere end et)

Barnets navn: Børnehave: Kommune: Barnets modersmål (kan være mere end et) Forældreskema Barnets navn: Børnehave: Kommune: Barnets modersmål (kan være mere end et) Barnets alder: år og måneder Barnet begyndte at lære dansk da det var år Søg at besvare disse spørgsmål så godt

Læs mere

Overview LINKING METRICS BACKLINKS TYPES. URL Rating Domain Rating Backlinks Referring Domains. Referring Pages 173. text 173. Total Backlinks 184

Overview LINKING METRICS BACKLINKS TYPES. URL Rating Domain Rating Backlinks Referring Domains. Referring Pages 173. text 173. Total Backlinks 184 Overview URL Rating Domain Rating Backlinks Referring Domains 12 35 184 11 0 0 0 0 LINKING METRICS Referring Pages 173 Total Backlinks 184 Crawled Pages 1 Referring IPs 9 Referring Subnets 8 Referring

Læs mere

Handout 1: Eksamensspørgsmål

Handout 1: Eksamensspørgsmål Handout 1: Eksamensspørgsmål Denne vejledning er udfærdiget på grundlag af Peter Bakkers vejledning til jeres eksamensspørgsmål. Hvis der skulle forekomme afvigelser fra Peter Bakkers vejledning, er det

Læs mere

CS 4390/5387 SOFTWARE V&V LECTURE 5 BLACK-BOX TESTING - 2

CS 4390/5387 SOFTWARE V&V LECTURE 5 BLACK-BOX TESTING - 2 1 CS 4390/5387 SOFTWARE V&V LECTURE 5 BLACK-BOX TESTING - 2 Outline 2 HW Solution Exercise (Equivalence Class Testing) Exercise (Decision Table Testing) Pairwise Testing Exercise (Pairwise Testing) 1 Homework

Læs mere

Skriftlig Eksamen Diskret matematik med anvendelser (DM72)

Skriftlig Eksamen Diskret matematik med anvendelser (DM72) Skriftlig Eksamen Diskret matematik med anvendelser (DM72) Institut for Matematik & Datalogi Syddansk Universitet, Odense Onsdag den 18. januar 2006 Alle sædvanlige hjælpemidler (lærebøger, notater etc.),

Læs mere

Forslag til implementering af ResearcherID og ORCID på SCIENCE

Forslag til implementering af ResearcherID og ORCID på SCIENCE SCIENCE Forskningsdokumentation Forslag til implementering af ResearcherID og ORCID på SCIENCE SFU 12.03.14 Forslag til implementering af ResearcherID og ORCID på SCIENCE Hvad er WoS s ResearcherID? Hvad

Læs mere

Sports journalism in the sporting landscape

Sports journalism in the sporting landscape Sports journalism in the sporting landscape - Blind spots of the journalists Foto: Bjørn Giesenbauer/Flickr Play the Game 2013 Aarhus, 30 October 2013 Ditte Toft Danish Institute for Sports Studies/Play

Læs mere

WIKI & Lady Avenue New B2B shop

WIKI & Lady Avenue New B2B shop WIKI & Lady Avenue New B2B shop Login Login: You need a personal username and password Du skal bruge et personligt username og password Only Recommended Retail Prices Viser kun vejl.priser! Bestilling

Læs mere

Trolling Master Bornholm 2014

Trolling Master Bornholm 2014 Trolling Master Bornholm 2014 (English version further down) Så er ballet åbnet, 16,64 kg: Det er Kim Christiansen, som i mange år også har deltaget i TMB, der tirsdag landede denne laks. Den måler 120

Læs mere

Using SL-RAT to Reduce SSOs

Using SL-RAT to Reduce SSOs Using SL-RAT to Reduce SSOs Daniel R. Murphy, P.E. Lindsey L. Donbavand November 17, 2016 Presentation Outline Background Overview of Acoustic Inspection Approach Results Conclusion 2 Background Sanitary

Læs mere

The GAssist Pittsburgh Learning Classifier System. Dr. J. Bacardit, N. Krasnogor G53BIO - Bioinformatics

The GAssist Pittsburgh Learning Classifier System. Dr. J. Bacardit, N. Krasnogor G53BIO - Bioinformatics The GAssist Pittsburgh Learning Classifier System Dr. J. Bacardit, N. Krasnogor G53BIO - Outline bioinformatics Summary and future directions Objectives of GAssist GAssist [Bacardit, 04] is a Pittsburgh

Læs mere

Exercise 6.14 Linearly independent vectors are also affinely independent.

Exercise 6.14 Linearly independent vectors are also affinely independent. Affine sets Linear Inequality Systems Definition 6.12 The vectors v 1, v 2,..., v k are affinely independent if v 2 v 1,..., v k v 1 is linearly independent; affinely dependent, otherwise. We first check

Læs mere

Dendrokronologisk Laboratorium

Dendrokronologisk Laboratorium Dendrokronologisk Laboratorium NNU rapport 8, 2001 BRO OVER SKJERN Å, RINGKØBING AMT Skjern Å Projektet/Oxbøl Statsskovdistrikt/RAS. Indsendt af Torben Egeberg og Mogens Schou Jørgensen. Undersøgt af Aoife

Læs mere

StarWars-videointro. Start din video på den nørdede måde! Version: August 2012

StarWars-videointro. Start din video på den nørdede måde! Version: August 2012 StarWars-videointro Start din video på den nørdede måde! Version: August 2012 Indholdsfortegnelse StarWars-effekt til videointro!...4 Hent programmet...4 Indtast din tekst...5 Export til film...6 Avanceret

Læs mere

Developing a tool for searching and learning. - the potential of an enriched end user thesaurus

Developing a tool for searching and learning. - the potential of an enriched end user thesaurus Developing a tool for searching and learning - the potential of an enriched end user thesaurus The domain study Focus area The domain of EU EU as a practical oriented domain and not as a scientific domain.

Læs mere

Appendix 1: Interview guide Maria og Kristian Lundgaard-Karlshøj, Ausumgaard

Appendix 1: Interview guide Maria og Kristian Lundgaard-Karlshøj, Ausumgaard Appendix 1: Interview guide Maria og Kristian Lundgaard-Karlshøj, Ausumgaard Fortæl om Ausumgaard s historie Der er hele tiden snak om værdier, men hvad er det for nogle værdier? uddyb forklar definer

Læs mere

The complete construction for copying a segment, AB, is shown above. Describe each stage of the process.

The complete construction for copying a segment, AB, is shown above. Describe each stage of the process. A a compass, a straightedge, a ruler, patty paper B C A Stage 1 Stage 2 B C D Stage 3 The complete construction for copying a segment, AB, is shown above. Describe each stage of the process. Use a ruler

Læs mere

Trolling Master Bornholm 2015

Trolling Master Bornholm 2015 Trolling Master Bornholm 2015 (English version further down) Panorama billede fra starten den første dag i 2014 Michael Koldtoft fra Trolling Centrum har brugt lidt tid på at arbejde med billederne fra

Læs mere

Trolling Master Bornholm 2013

Trolling Master Bornholm 2013 Trolling Master Bornholm 2013 (English version further down) Tilmeldingerne til 2013 I dag nåede vi op på 85 tilmeldte både. Det er stadig lidt lavere end samme tidspunkt sidste år. Tilmeldingen er åben

Læs mere

TM4 Central Station. User Manual / brugervejledning K2070-EU. Tel Fax

TM4 Central Station. User Manual / brugervejledning K2070-EU. Tel Fax TM4 Central Station User Manual / brugervejledning K2070-EU STT Condigi A/S Niels Bohrs Vej 42, Stilling 8660 Skanderborg Denmark Tel. +45 87 93 50 00 Fax. +45 87 93 50 10 info@sttcondigi.com www.sttcondigi.com

Læs mere

Kvant Eksamen December 2010 3 timer med hjælpemidler. 1 Hvad er en continuous variable? Giv 2 illustrationer.

Kvant Eksamen December 2010 3 timer med hjælpemidler. 1 Hvad er en continuous variable? Giv 2 illustrationer. Kvant Eksamen December 2010 3 timer med hjælpemidler 1 Hvad er en continuous variable? Giv 2 illustrationer. What is a continuous variable? Give two illustrations. 2 Hvorfor kan man bedre drage konklusioner

Læs mere

Strategic Capital ApS has requested Danionics A/S to make the following announcement prior to the annual general meeting on 23 April 2013:

Strategic Capital ApS has requested Danionics A/S to make the following announcement prior to the annual general meeting on 23 April 2013: Copenhagen, 23 April 2013 Announcement No. 9/2013 Danionics A/S Dr. Tværgade 9, 1. DK 1302 Copenhagen K, Denmark Tel: +45 88 91 98 70 Fax: +45 88 91 98 01 E-mail: investor@danionics.dk Website: www.danionics.dk

Læs mere

How Long Is an Hour? Family Note HOME LINK 8 2

How Long Is an Hour? Family Note HOME LINK 8 2 8 2 How Long Is an Hour? The concept of passing time is difficult for young children. Hours, minutes, and seconds are confusing; children usually do not have a good sense of how long each time interval

Læs mere

Info og krav til grupper med motorkøjetøjer

Info og krav til grupper med motorkøjetøjer Info og krav til grupper med motorkøjetøjer (English version, see page 4) GENERELT - FOR ALLE TYPER KØRETØJER ØJER GODT MILJØ FOR ALLE Vi ønsker at paraden er en god oplevelse for alle deltagere og tilskuere,

Læs mere

Black Jack --- Review. Spring 2012

Black Jack --- Review. Spring 2012 Black Jack --- Review Spring 2012 Simulation Simulation can solve real-world problems by modeling realworld processes to provide otherwise unobtainable information. Computer simulation is used to predict

Læs mere

BILAG 8.1.B TIL VEDTÆGTER FOR EXHIBIT 8.1.B TO THE ARTICLES OF ASSOCIATION FOR

BILAG 8.1.B TIL VEDTÆGTER FOR EXHIBIT 8.1.B TO THE ARTICLES OF ASSOCIATION FOR BILAG 8.1.B TIL VEDTÆGTER FOR ZEALAND PHARMA A/S EXHIBIT 8.1.B TO THE ARTICLES OF ASSOCIATION FOR ZEALAND PHARMA A/S INDHOLDSFORTEGNELSE/TABLE OF CONTENTS 1 FORMÅL... 3 1 PURPOSE... 3 2 TILDELING AF WARRANTS...

Læs mere

Design til digitale kommunikationsplatforme-f2013

Design til digitale kommunikationsplatforme-f2013 E-travellbook Design til digitale kommunikationsplatforme-f2013 ITU 22.05.2013 Dreamers Lana Grunwald - svetlana.grunwald@gmail.com Iya Murash-Millo - iyam@itu.dk Hiwa Mansurbeg - hiwm@itu.dk Jørgen K.

Læs mere

Vejledning til Sundhedsprocenten og Sundhedstjek

Vejledning til Sundhedsprocenten og Sundhedstjek English version below Vejledning til Sundhedsprocenten og Sundhedstjek Udfyld Sundhedsprocenten Sæt mål og lav en handlingsplan Book tid til Sundhedstjek Log ind på www.falckhealthcare.dk/novo Har du problemer

Læs mere

Mandara. PebbleCreek. Tradition Series. 1,884 sq. ft robson.com. Exterior Design A. Exterior Design B.

Mandara. PebbleCreek. Tradition Series. 1,884 sq. ft robson.com. Exterior Design A. Exterior Design B. Mandara 1,884 sq. ft. Tradition Series Exterior Design A Exterior Design B Exterior Design C Exterior Design D 623.935.6700 robson.com Tradition Series Exterior Design A w/opt. Golf Cart Garage Exterior

Læs mere

Digitaliseringsstyrelsen

Digitaliseringsstyrelsen NemLog-in 29-05-2018 INTERNAL USE Indholdsfortegnelse 1 NEMLOG-IN-LØSNINGER GØRES SIKRERE... 3 1.1 TJENESTEUDBYDERE SKAL FORBEREDE DERES LØSNINGER... 3 1.2 HVIS LØSNINGEN IKKE FORBEREDES... 3 2 VEJLEDNING

Læs mere

MSE PRESENTATION 2. Presented by Srunokshi.Kaniyur.Prema. Neelakantan Major Professor Dr. Torben Amtoft

MSE PRESENTATION 2. Presented by Srunokshi.Kaniyur.Prema. Neelakantan Major Professor Dr. Torben Amtoft CAPABILITY CONTROL LIST MSE PRESENTATION 2 Presented by Srunokshi.Kaniyur.Prema. Neelakantan Major Professor Dr. Torben Amtoft PRESENTATION OUTLINE Action items from phase 1 presentation tti Architecture

Læs mere

Kort A. Tidsbegrænset EF/EØS-opholdsbevis (anvendes til EF/EØS-statsborgere) (Card A. Temporary EU/EEA residence permit used for EU/EEA nationals)

Kort A. Tidsbegrænset EF/EØS-opholdsbevis (anvendes til EF/EØS-statsborgere) (Card A. Temporary EU/EEA residence permit used for EU/EEA nationals) DENMARK Residence cards EF/EØS opholdskort (EU/EEA residence card) (title on card) Kort A. Tidsbegrænset EF/EØS-opholdsbevis (anvendes til EF/EØS-statsborgere) (Card A. Temporary EU/EEA residence permit

Læs mere

Remember the Ship, Additional Work

Remember the Ship, Additional Work 51 (104) Remember the Ship, Additional Work Remember the Ship Crosswords Across 3 A prejudiced person who is intolerant of any opinions differing from his own (5) 4 Another word for language (6) 6 The

Læs mere

Domestic violence - violence against women by men

Domestic violence - violence against women by men ICASS 22 26 august 2008 Nuuk Domestic violence - violence against women by men Mariekathrine Poppel Email: mkp@ii.uni.gl Ilisimatusarfik University of Greenland Violence: : a concern in the Arctic? Artic

Læs mere

Listen Mr Oxford Don, Additional Work

Listen Mr Oxford Don, Additional Work 57 (104) Listen Mr Oxford Don, Additional Work Listen Mr Oxford Don Crosswords Across 1 Attack someone physically or emotionally (7) 6 Someone who helps another person commit a crime (9) 7 Rob at gunpoint

Læs mere