Thursday, April 27, 2006

Brent Corrigan Heat Online



Comments on the sample for Questionnaire #1.

We seem to be facing a "Trojan horse": the entire Hispanic world could fit in our sample. A platitudinous definition, and a utopian one besides.

Colleagues, we could spend a year (at least) trying to define the sample for our project.

We have many readings on the matter, which we put at your disposal. For the moment, we will not flood your electronic mailboxes with readings; we will send them on request.

In the meantime, we ask you to consider:

1 º. Corpus: of the (many) definitions of "sample", we chose to offer that of the EAGLES (Expert Advisory Group on Language Engineering Standards) working group on text corpora (1996a: 4):
Corpus: "A collection of pieces of language that are selected and ordered according to explicit linguistic criteria in order to be used as a sample of the language."

This definition contains three fundamental aspects to be considered when defining a corpus: it should be composed of texts produced in real situations ("pieces of language"), the inclusion of texts should be guided by a set of explicit linguistic criteria, and those criteria should ensure that the corpus can be used as a representative sample of a language. All corpus scholars agree that these aspects are fundamental to the creation and definition of corpora, yet they remain controversial issues that have sometimes resulted in differing positions. However

...

2 º. "More data is better data". Following this premise, as opposed to the previous one, we would not need to set limits for the moment, since we are talking about many millions of English speakers. If we think of our study, which concerns directly and exclusively the usage habits of certain words, the larger the volume of text processed, the more representative the resulting statistical indices will tend to be.
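The premise can be illustrated with a minimal sketch. It assumes a deliberately naive tokenizer and treats tokens as independent draws, which real text does not satisfy; the function names are hypothetical and not part of any tool used in the project. Under that simplification, the standard error of a word's relative frequency shrinks with the square root of the number of tokens, so a corpus 100 times larger gives an estimate roughly 10 times tighter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive, for illustration)."""
    return re.findall(r"\w+", text.lower())


def relative_frequency(tokens, word):
    """Return the share of tokens equal to `word` (0.0 for an empty token list)."""
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens)


def standard_error(p, n):
    """Standard error of a proportion p estimated from n tokens.

    Assumes independent tokens, which is a simplification for running text.
    """
    return math.sqrt(p * (1 - p) / n)


# Toy sentence pair, for illustration only.
tokens = tokenize("La casa es grande. La casa es vieja.")
share = relative_frequency(tokens, "casa")  # 2 of 8 tokens

# Same underlying frequency, two corpus sizes: the larger one is more precise.
se_small = standard_error(0.001, 1_000)
se_large = standard_error(0.001, 100_000)
```

The comparison between `se_small` and `se_large` is the point of the sketch: the word's frequency is the same in both cases, and only the amount of text changes.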

3 º. Constant review: the build process should, according to Biber (1993: 256), be cyclical, so a pilot corpus must first be built in order to study its composition and decide which design parameters should be modified:
"A pilot corpus should be compiled first, representing a relatively broad range of variation but also representing depth in some registers ... Then empirical research should be carried out on this pilot corpus to confirm or modify the various design parameters. Parts of this cycle could be carried out in an almost continuous fashion, with new texts being analyzed as they become available, but there should also be discrete stages of extensive empirical investigation and revision of the corpus design."

This same cyclical conception of corpus compilation is reflected in the work of Tognini-Bonelli (1996b: 73), who states that the design of the corpus should be continuously reviewed and the results of the data analysis evaluated, so that some of the design criteria can be modified if the linguist deems it necessary.
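One pass of such a review cycle might be sketched as follows. This is only an illustration under stated assumptions: the register names, token counts, target shares and the 5-point tolerance are all hypothetical, and neither Biber nor Tognini-Bonelli prescribes this particular check. The sketch compares each register's observed share of the pilot corpus with a design target and lists the registers that fall short.

```python
from collections import Counter


def register_shares(corpus):
    """corpus: list of (register, token_count) pairs.

    Return each register's share of all tokens in the corpus.
    """
    totals = Counter()
    for register, token_count in corpus:
        totals[register] += token_count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {register: count / grand_total for register, count in totals.items()}


def underrepresented(corpus, target_shares, tolerance=0.05):
    """Return registers whose observed share is below target by more than `tolerance`."""
    observed = register_shares(corpus)
    return sorted(
        register
        for register, target in target_shares.items()
        if observed.get(register, 0.0) < target - tolerance
    )


# Hypothetical pilot corpus and design targets, for illustration only.
pilot = [("press", 6000), ("fiction", 3000), ("conversation", 1000)]
targets = {"press": 0.4, "fiction": 0.3, "conversation": 0.3}

# First pass: analyze the pilot and decide what to collect next.
to_collect = underrepresented(pilot, targets)

# Second pass, after adding conversation texts: the balance shifts again.
revised = pilot + [("conversation", 3000)]
to_collect_next = underrepresented(revised, targets)
```

In this toy run the first pass flags "conversation", and once conversation texts are added the second pass flags "fiction" instead, which is why the design has to be checked again after every round of collection.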

We add and emphasize the definition by the semiologist Omar Calabrese, derived from Foucault and the natural sciences: "the set of documents that are necessary and sufficient to obtain a unified conclusion, becoming a 'single body object' [...] there is no analysis in se e per se: it is always something constructed by the analyst, who in turn has to justify the operation by defining the rules of relevance and reciprocity by which it has been built or accumulated" (Lezioni di semisimbolico).
