
8th International Conference of the African Association for Lexicography
AFRILEX 2003
Bilingual Dictionaries
Programme & Abstracts
Dates: 7-9 July 2003
Host: Department of Germanic & Romance Languages, University of Namibia, Windhoek, Namibia
Local Conference Organiser: Mr. Herman Beyer
Abstract Reviewers: Prof. Rufus H. Gouws, Prof. D.J. Prinsloo, Dr. Elsabé Taljard, Ms. Anneleen Van der Veken
Programme Committee: Mr. Herman Beyer, Mr. Gilles-Maurice de Schryver, Prof. D.J. Prinsloo
edited by Gilles-Maurice de Schryver
Organiser: AFRILEX
Copyright © 2003 by the African Association for Lexicography
ISBN 0-620-30795-1
Pretoria: (SF)2 Press
Cover Screenshots by David Joffe: “From TshwaneLex to Online Dictionary”
(david.joffe@africanlanguages.com | http://africanlanguages.com)
Cover Artwork by Giovanni Plozner
(info@giovanniplozner.com | http://www.giovanniplozner.com)
A FEW WORDS FROM THE CHAIRPERSON
Afrilex welcomes you to our 8th International Conference, which also marks our 8th
year of existence. We are proud to be a member of the international –lex
family and to present you with this Conference Abstract Booklet, once again
meticulously compiled and edited by Gilles-Maurice de Schryver.
I wish to thank you for attending the Conference and for your loyal support for
our Association and lexicography in Africa.
Afrilex greetings
D.J. Prinsloo
§ Ulrich Heid — The Handling of Collocations and Idiomatic Multiword Expressions: From Corpora to Dictionaries
§ Rufus H. Gouws — Outer Texts in Bilingual Dictionaries
§ Gwyneth Fox — Corpus Research and Lexicography
§ Thierry Afane Otsaga — Hybrid Dictionaries – The Future of Lexicography
§ Mariëtta Alberts — Lexicography and Terminology Training at University Level
§ Herman L. Beyer — Can We Quantify the Effects of Dictionary Use?
§ Emmanuel Chabata — Interviewer-Interviewee Interaction in Oral Interviews
§ Gilles-Maurice de Schryver — Concurrent Over- and Under-treatment in Dictionaries — The Woordeboek van die Afrikaanse Taal as a case in point
§ James D. Emejulu — Revisiting Equivalence in Bilingual Lexicography
§ James D. Emejulu, Yolande Nzang-Bie, Pierre Ondo-Mebiame & D. Franck Idiata — Le rôle des dictionnaires bilingues dans le développement des langues Gabonaises: Le cas du fang
§ Rachélle Gauton — Bilingual Dictionaries, the Lexicographer and the Translator
§ Wilfrid H.G. Haacke — A Khoekhoegowab Dictionary in the Making: Some Lexicographic Considerations in Retrospect
§ Samukele Hadebe — The Proposed Ndebele – Shona Dictionary: Prospects and Challenges
§ Kathy Kavanagh — English for New South African Bilingual Dictionaries
§ Langa Khumalo — From a General to an Advanced Ndebele Dictionary: An Outline
§ John M. Lubinda — The Incorporation and Handling of Metaphorical or Figurative Meaning in Bilingual Dictionaries
§ Matete Madiba, Lorna Mphahlele & Matlakala Kganyago — Capturing Cultural Glossaries. Case Study II: Medical Terms
§ Mandlenkosi Maphosa — The Users’ Perspectives on Isichazamazwi SeSiNdebele
§ Webster Mavhu — Bilingual versus Monolingual: A Comparative Analysis of Two Trends in Shona Lexicography
§ Gift Mheta — The Impact of Translation Activities on the Development of African Languages in Multilingual Societies: Shona – Ndebele – English Musical Terms Dictionary, a Case Study
§ Linkie Mohlala, Gilles-Maurice de Schryver & Rachélle Gauton — The Lexicographic Treatment of the Feminine/Augmentative Suffix ‑kazi in isiZulu
§ Nomalanga Mpofu — The ALRI Experience in the Compilation of a Dictionary of Biomedical Terms
§ Cornelias Ncube — Language Development or Language Corruption: A Case of Loanwords in Isichazamazwi SeSiNdebele
§ Salmina Nong & M.P. Mogodi — The Lexicographic Treatment of the Demonstrative Copulative in Sesotho sa Leboa – An Exercise in Multiple Cross-referencing
§ Thapelo J. Otlogetswe — Challenges to Representative and Balanced Corpora for African Lexicography
§ Annél Otto & Nerina Bosman — The User Perspective: Bible Reference Resources as Example
§ D.J. Prinsloo — The Lemmatisation of Adverbs in Northern Sotho
§ M.P. Rakgokong — Are the Setswana Mockery Words that Objectionable?
§ Mariza Steyn & Liezl Gouws — Woordeboek sonder Grense: A Typological and Communicative Bridge
§ P.H. Swanepoel — Dictionary Tailoring, SL Lexical Acquisition and Computer-Assisted Language Learning: The LINC Approach
§ Elsabé Taljard — On the Semi-automatic Extraction of Definitional Information: A Case Study for Northern Sotho
§ Dirk J. van Schalkwyk — Language Variation and the Lexicographer
Programme
AFRILEX 2003
Keynote papers
The Handling of Collocations and Idiomatic Multiword Expressions: From
Corpora to Dictionaries
Ulrich Heid
Institut für maschinelle Sprachverarbeitung – Computerlinguistik,
Universität Stuttgart, Germany
Corpus query tools, such as WordSmith Tools or Qwick (Birmingham University), come
with a function to extract collocations of a given word from a corpus. As a
result, they provide lists of word pairs, often together with a measure
indicating how strongly the two elements belong together. Years ago, a
computational linguist told me in a discussion that, with these tool functions,
the problem of collocations in corpus lexicography was solved. This talk is
intended to show why this is not the case.
The
abovementioned collocation tools are based on statistical association measures
that determine statistically significant co-occurrences of words. Examples of
such association measures include the t-test (Church & Hanks 1992), the
log-likelihood ratio test (Dunning 1993), the Mutual Information measure, etc.
They are all used to reorder lists of collocation candidates, possibly
extracted beforehand by means of corpus query (e.g. for nouns and the verbs
these nouns are objects of, as in “pay attention”, “ask a question”, etc.).
Examples and a few well-known problems of the underlying statistics will be
discussed; for example, Mutual Information unduly privileges low-frequency
words, and log-likelihood seems to work well in particular for the upper half of
the frequency spectrum, while remaining quite dependent on frequency.
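As an illustration only (it is not taken from the talk), the following minimal sketch shows how three such measures can be computed for one candidate pair from raw corpus counts. The function name, the 2x2 contingency-table formulation, the t-score approximation and the toy counts are all assumptions made for this example.

```python
import math


def association_scores(f_xy, f_x, f_y, n):
    """Score one candidate word pair from raw counts (illustrative sketch).

    f_xy -- how often the two words co-occur
    f_x  -- frequency of the first word
    f_y  -- frequency of the second word
    n    -- sample size (total number of pairs considered)
    """
    expected = f_x * f_y / n  # co-occurrences expected under independence

    # Pointwise Mutual Information: log2 of observed over expected.
    mi = math.log2(f_xy / expected)

    # t-score in the approximation commonly used in corpus work.
    t_score = (f_xy - expected) / math.sqrt(f_xy)

    # Log-likelihood ratio (G2) over the full 2x2 contingency table.
    observed = [f_xy, f_x - f_xy, f_y - f_xy, n - f_x - f_y + f_xy]
    rows = [f_x, n - f_x]
    cols = [f_y, n - f_y]
    expect = [r * c / n for r in rows for c in cols]
    g2 = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expect) if o > 0)

    return {"mi": mi, "t": t_score, "g2": g2}


# Toy counts (invented): a pair seen only twice versus a frequent pair.
rare = association_scores(f_xy=2, f_x=2, f_y=2, n=1_000_000)
frequent = association_scores(f_xy=500, f_x=2000, f_y=3000, n=1_000_000)
print(rare["mi"] > frequent["mi"])   # MI favours the rare pair
print(frequent["t"] > rare["t"])     # the t-score favours the frequent pair
```

With these invented counts, Mutual Information ranks the pair seen only twice above the frequent pair, while the t-score and log-likelihood rank them the other way round. This is the frequency sensitivity the paragraph above refers to.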
An analysis of some German and English
data obtained in this way from corpora will show that the results of the
statistical procedures, even though to some extent useful for lexicographic
work, are far from homogeneous: they typically include a mixture of
collocations and idiomatic word groups, as well as of trivial,
lexicographically irrelevant, word combinations which may, for example, be
artefacts of the corpus under analysis.
We thus need additional linguistic
criteria to further classify the material, but also, more importantly, to
discover additional morphosyntactic, syntactic and semantic properties of the
word combinations identified so far only in terms of the lexemes involved. It
is not sufficient to know that “pay” and “attention” go together; we must also
know that “pay attention” takes no article, or that “former” and “time” typically
occur as a plural expression, often with a preposition: “in former times”. These
aspects contribute to the partial idiomatisation of collocations, and a learner
of a foreign language must memorise them along with the collocation. For German
and English noun+verb combinations, an attempt will be made to provide a
classified list of phenomena which need to be kept track of, beyond lexical
co-occurrence, to arrive at a detailed description of the respective
multiword items. The claim we would like to make is that collocations and
idiomatic multiword expressions must be lexicographically described in as much
detail as any single-word lemma; this means that information about the
components of the collocation, as well as about the collocation as a whole must
be given with respect to morphosyntactic, syntactic (e.g. construction),
semantic and pragmatic (e.g. style/register, frequency) properties.
Furthermore, collocations tend to be combined, such that texts often include
significant triples or quadruples of words (e.g. (pay+attention) +
(careful+attention): pay careful attention). Along with the phenomena, a few
suggestions for their corpus-based acquisition will be made (Heid &
Zinsmeister 2003).
In the third part, the question of the
lexicographic data presentation will be discussed. Beyond the question of where
to lemmatise collocations and idiomatic multiword groups, the detailed
phenomena discussed above make the writing of an article somewhat more
difficult, as they need to be kept track of. We look at this problem with
bilingual (active) dictionaries in mind, printed as well as electronic. Inspiration
for the article layout may come from experimental dictionaries such as
Mel’čuk’s Explanatory Combinatorial Dictionaries, but also from printed
dictionaries for general users, such as the Van Dale series of bilingual
dictionaries in the Netherlands. Sample entries in different “styles” will be
briefly discussed.
Outer Texts in Bilingual Dictionaries
Rufus H. Gouws
Department of Afrikaans and Dutch, University of Stellenbosch, South
Africa
Metalexicographic research of recent years has been characterised by a growing
interest in and focus on various aspects of the structure of dictionaries. In this
regard both the shared features and dictionary-specific features have received
attention. Dictionary research no longer only includes attempts to describe and
analyse the contents of dictionaries and the different data types on offer; the
different structural components of dictionaries also fall within the scope of
this field of research. As a carrier of text types a dictionary is no longer
regarded merely as a source of information displaying a variety of data types in the
central list. A new emphasis shifts the attention away from a central list bias
towards a more inclusive frame structure approach. This approach works with the
assumption that the central list is complemented by front and back matter
texts, constituting the outer texts of a dictionary.
Utilising the frame structure approach this paper
focuses on the use of outer texts in bilingual dictionaries. The distinction
between integrated and unintegrated outer texts is maintained and both these
text types, their purpose and the role they play in devising the data
distribution structure of a dictionary are examined. In using integrated outer
texts it is shown that the data distribution does not have to focus exclusively
on the default article in the central list although article stretches still
accommodate the most typical data categories directed at the lemmata as guiding
elements of articles and primary treatment units. It is shown how an
interactive relation between the integrated outer texts and the central list
can achieve an optimal realisation of the genuine purpose of a bilingual
dictionary and can enhance the quality of dictionary consultation procedures.
As examples of unintegrated outer texts the use of
alphabetically ordered equivalent registers, the listing of items representing
the lemmata included in complex and synopsis articles as well as additional
pedagogical data will be discussed. It is also shown how back matter texts can
add a typological hybrid character to a dictionary by using alternative
ordering systems, e.g. a thematic ordering as opposed to the alphabetical
ordering of the central list. The way in which outer texts can ensure that a
dictionary has a poly-accessible character that meets the needs of a
user-driven project is also discussed. With regard to the user and the usage situation,
the role of dictionary functions in the planning of the outer texts should never
be underestimated, and various aspects of the theory of lexicographic functions
come to the fore in the discussion.
The successful use of outer texts demands a new look at
the data distribution structure of bilingual dictionaries. Emphasis is yet
again placed on the need for each dictionary project to include a
well-devised dictionary plan.
In this paper a dictionary is seen as a comprehensive
container of knowledge and suggestions are made to improve the quality of the
access structure to ensure an optimal retrieval of information by the intended
target user.
Corpus Research and Lexicography
Gwyneth Fox
Macmillan Education: Publisher, Dictionaries
Work
with corpora over the past 20 years has shown us a great deal about how we use
English. In particular, there have been many revelations about the ways in
which vocabulary patterns are surprisingly predictable, and these findings are
now being reflected in learners’ dictionaries. This means that such
dictionaries are probably the best record we have of the way in which English
is now being used. Many examples will be given to justify this statement. But
there is no reason why corpus research should not influence bilingual
lexicography more than it presently seems to.
People are fascinated
by language. And researchers have been studying it for centuries. But it is
only in the past twenty years or so that we can be sure that the statements we make
about the language are accurate. That is because the advent of computers has
allowed us to build corpora, as large or as small as are appropriate for our
particular needs, and analyse them for frequency, grammar, vocabulary,
pragmatics, discourse functions, and so on. Perhaps the two areas where we have
learned most are those of frequency and vocabulary.
Although we always knew that some words
were more frequent than others, we now know which words these are, and how
often they are used and in what contexts. This must be important information
for learners of a language: they need to know which words are worth expending
effort on!
We also realise that it is not enough
just to look at words, however frequent they might be, in isolation.
Collocation and colligation patterns stand out in the data, and force us to
reassess the way in which we describe words, both in the classroom and in
dictionaries. Collocation patterns range from the relatively fixed and
difficult to decode, as in idioms and proverbs, through binomials and
trinomials, through chunking, right down to those that are weak and perhaps not
worth mentioning. The same is true of colligation. The phraseology of the
language is much less random, much more predictable than we ever imagined.
Another vocabulary ‘discovery’ is that
of semantic prosody. Why is it that some words have attracted to them other
words, either positive or negative, so that it is almost impossible to use them
in any other way? Some of these words are obvious, others much less so. How
could a learner know about their prosody if it were not pointed out to them?
Corpus findings are now well known, and
are expressed at their best in the new breed of learners’ dictionaries produced
in the UK in the past fifteen or so years. This makes these dictionaries the
best, most up-to-date, most accurate record of English as it is presently being
used. Some bilingual dictionaries are now being compiled with the benefit of
two, often parallel, corpora; but it seems to me that they do not yet offer as good
(or as helpful) a description of the language as is found in monolingual
learners’ dictionaries.
Parallel sessions
Hybrid Dictionaries – The Future of Lexicography
Thierry Afane Otsaga
Department of Afrikaans and Dutch, University of Stellenbosch, South Africa
Dictionaries have been compiled for several thousand years. The need for them
arose when it became more difficult to read and understand religious texts.
Dictionaries were therefore invented to assist in the understanding of these
texts, which were written in a language that was no longer understood by the
people interested in them. Nowadays, dictionaries are still produced because certain human
linguistic and knowledge needs are observed in society and they are compiled to
satisfy these needs. Satisfying such needs is the main purpose of
dictionaries.
In order to always satisfy user needs,
lexicographers have been trying to compile different types of dictionaries,
according to different aspects: the users’ language competences, users’ general
culture and knowledge, users’ respective subject fields, users’ translation
needs, etc. In general, they have to take into account the objectives users
have when using dictionaries. In that regard, various types of
dictionaries have been compiled, each to be used by a specific target user group.
Indeed, some dictionaries are directed at the extra-linguistic features of the
items treated (encyclopaedic dictionaries), while other dictionaries focus on
the linguistic and pragmatic aspects (linguistic dictionaries). Some
dictionaries focus on the origin, history and development of the treated
language (diachronic dictionaries), while still others focus on the lexicon of
a language at a specific time in its development (synchronic dictionaries). In
the category of linguistic dictionaries, monolingual dictionaries can aim at a
scholastic approach (school dictionaries), a learning approach (learners’
dictionaries), a normative approach (standard dictionaries), or a comprehensive
approach (comprehensive dictionaries). Bilingual or multilingual
dictionaries, in turn, can be compiled for a polyfunctional purpose (polyfunctional
dictionaries); they can also be monoscopal or biscopal. All these various types
of dictionaries were driven by the necessity to satisfy users’ needs.
The main objective of
lexicographical works is to satisfy the needs of the users. When dealing with
the methodology and even with the planning of a dictionary, one must first
define the target user; otherwise the compilation will not be efficient.
In every lexicographical work the main interest is the dictionary
user. In modern lexicography, the role and the place of the user are increasingly
taken into account. Users are a powerful lobby and the publishing houses
know it well: even if a dictionary is compiled with a sound methodology, if
a user does not find the information he/she needs, this dictionary will not be
sold or used. Thus, the user appears to be the focal point on which each
element of the lexicographical process converges. Because user needs are
increasing and because most people want knowledge regarding different aspects
of life, it is becoming increasingly difficult to satisfy user needs in one
specific type of dictionary. At the same time, users do not want to spend more
time and money by buying different dictionaries according to what they are
looking for. The ideal solution for them could be to find most information they
need in one single dictionary. On the other hand, it is important to specify
that it is not possible to satisfy all the user needs in one dictionary, even
in a multi-volume dictionary. Yet the lexicographer must try to come as close
as possible to satisfying user needs. For that reason, the only solution could
be the compilation of hybrid dictionaries. In fact, in modern-day lexicography
hybrid dictionaries will be the solution of the future that will allow
lexicographers to give users what they are looking for in a dictionary.
In that regard, some dictionaries will not have one specific purpose, but could
include two, three, four, and even five functions. A bilingual dictionary for
instance will not only give translation equivalents of lemmas, it will also
give paraphrases of meaning in order to allow the users to utilise the same
dictionary to solve not only their problem of translation, but also to be able
to improve their knowledge in the same language. The main purpose of this paper
is to show that as a result of new and increasing user needs, the best way for
future lexicography will be the compilation of hybrid dictionaries.
Dictionaries focusing on one unique and specific aspect will no longer satisfy
a public that needs to have knowledge about various aspects and domains.
Lexicography and Terminology Training at University Level
Mariëtta Alberts
Manager: Lexicography and Terminology Development, PanSALB, South Africa
The multilingual dispensation
creates job opportunities for language practitioners. These language
practitioners need training in various aspects regarding the language practice
since lexicography, terminography, translation and editing (to name but a few)
are practices that need highly skilled and knowledgeable practitioners.
Several
of the focus areas of the Pan South African Language Board (PanSALB)
concentrate to a certain extent on language development, such as terminology
development, lexicography or aspects like translation and interpreting
services. PanSALB is aware that all these language practices need skilled and
highly trained personnel.
The
Lexicography and Terminology Development (L&TD) focus area deals with the
eleven National Lexicography Units (NLUs) and one national terminology office.
The eleven national lexicography units were established and each is situated at
a tertiary institution in the geolinguistic area where most of the
mother-tongue speakers of the specific language are found. Unfortunately, there
are only a few trained lexicographers available to work at these units. The
only national terminology office in the country, the Terminology Coordination
Section (TCS) is part of the National Language Service (NLS), Department of
Arts and Culture (DAC). The terminologists receive in-house training on
terminological and terminographical principles and practice. It is of the
utmost importance to train language practitioners and students to be able to
compile general as well as technical dictionaries for communication purposes.
The
value of lexicography and terminology training cannot be stressed enough. The
need might even be greater in South Africa than in other countries given the
multilingual clause in the Constitution that provides for eleven official South
African languages. Multilingual general as well as technical dictionaries are
needed for proper communication between linguistic communities. Presently there
are very few trained lexicographers and terminologists, especially in the
African languages. Language practitioners, who are going to work on
lexicographical or terminographical projects in future, need training as soon
as possible.
This paper addresses
the current situation regarding lexicography and terminology training.
Suggestions are made regarding the utilisation of Schools for Languages as
training venues for lexicography and terminology courses. The benefits for the
Schools of Languages are spelled out. The value to other departments and
faculties at the given university, and the benefit to students at other
universities in the country and worldwide and to language offices or language
units, also receive attention. The process as described would train students in the
theory, principles and practice of lexicography and terminology. It would be to
the advantage of the NLUs as well as the TCS and the language units still to be
established to appoint trained personnel rather than to devote time to in-house
training. Production of general dictionaries as well as various technical
dictionaries would show progress.
The
various tertiary institutions such as the universities and technikons would
benefit because they would train students and there would be positive and
worthwhile outcomes.
The
Human Language Technology virtual network would benefit by receiving
multilingual general words and multilingual, polythematic terms into its
database for dissemination to linguistic communities.
The
language community would benefit since they would have words and terms
available for better communication. Minority languages would be developed to
become functional languages in the higher echelons of science and technology.
Finally, the South African languages would be available as functional world
languages on the Internet.
Can We Quantify the Effects of Dictionary Use?
Herman L. Beyer
Department of Germanic & Romance Languages, University of Namibia, Windhoek, Namibia
This paper aims to give an overview of the empirical research into the
possibility of quantifying the effects of dictionary use among school learners,
which has been conducted as a pilot study at the University of Namibia. The
initial processes and results are explained, providing insight into how the
project may be amended to continue meaningfully.
The first data for this project were captured in 1997 while the researcher was a
language teacher in Swakopmund, employed by the Ministry of Basic Education and
Culture of Namibia. The working hypothesis was that the use of
dictionaries by school learners would result in improved linguistic
performance. One linguistic skill, that of spelling, was chosen for the
experiment. The respondents comprised two classes of Grade 11 learners who
took Afrikaans as a first language. One class group was labelled the test group,
the other the control group. Both groups were given a series of four
unannounced spelling tests, at intervals ranging from three days to as much as
two months. Each test consisted of the same 25 items, chosen on the basis of
the potential spelling difficulties they might pose for learners. The learners
were not informed that the test would be repeated. They were, however, on each
occasion advised that the tests did not contribute to their continuous assessment
mark and were not designed to measure any aspect of intelligence. In this way,
it was hoped that conditions resembling normal class conditions as closely as
possible could be created.
The first spelling test was written by
both groups under similar conditions: normal test conditions without the
benefit of a dictionary.
During the second test each member of the test group was provided with a dictionary on
his/her desk. The respondents were given the freedom to look up any item in the
dictionary to make sure of its spelling, provided that they would indicate
dictionary use. This would enable the researcher to identify those items that a
particular respondent chose to look up. The control group wrote the second test
under conditions identical to those during the first test, i.e. without the
benefit of a dictionary. Unlike the test group, however, the control group
members were given immediate feedback on their tests by having them marked
after exchanging the scripts among the respondents (i.e. a respondent would not
mark his/her own test). Respondents were instructed to clearly indicate
mistakes on their fellow respondents’ scripts and to write down the correct
form in full each time. After the respondents received their tests back, they
were given about 30 seconds to take a look at the results, including the
corrections made by their fellow respondents. The test group was given no
feedback of any nature on their tests.
The third and fourth tests were conducted
under the same conditions as the first, i.e. normal test conditions without the
benefit of a dictionary.
This experimental
procedure provided the researcher with extensive data, from which it is hoped
the following questions could be approached with quantitative support:
· Does a respondent who looks up a word for
spelling purposes remember its correct spelling later? If yes, for how long? If
no, are any consistencies identifiable that may allow us insight into the
reasons for the perceived failure to learn and perhaps into spelling
rehabilitation?
· Is a respondent who looks up a word for spelling
purposes more likely to remember its correct spelling than a respondent who
does not utilise a dictionary but who is provided with rehabilitative feedback in
the ‘traditional’ way? If yes, what is the role of the dictionary in this case?
If no, why has learning seemingly not taken place?
The above questions
underlie the basic research question that this project aims to address: Does
dictionary use result in quantifiable improved linguistic performance?
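One way the first question could be operationalised is sketched below. This is an assumption made for illustration, not the procedure used in the pilot study: the function name, the item labels and all results are invented. The sketch takes the items a respondent misspelt in the first test and looked up during the second, and reports the share spelt correctly in a later test.

```python
def retention_rate(first, later, eligible=None):
    """Share of items misspelt in `first` that are spelt correctly in `later`.

    first, later -- dicts mapping an item label to True (correct) or False
    eligible     -- optional set of item labels to restrict the count to,
                    e.g. the items a respondent marked as looked up
    Returns None when there is nothing to measure.
    """
    wrong_first = [
        item
        for item, correct in first.items()
        if not correct and (eligible is None or item in eligible)
    ]
    if not wrong_first:
        return None
    return sum(later[item] for item in wrong_first) / len(wrong_first)


# Invented results for one hypothetical test-group respondent.
test1 = {"item01": False, "item02": False, "item03": True, "item04": False}
test3 = {"item01": True, "item02": False, "item03": True, "item04": False}
looked_up_in_test2 = {"item01", "item02"}

print(retention_rate(test1, test3, looked_up_in_test2))
```

Computing the same rate for control-group respondents, restricted to the items corrected through peer feedback, would give a comparable figure for the second question.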
Interviewer-Interviewee Interaction in Oral Interviews
Emmanuel Chabata
African Languages Research Institute, University of Zimbabwe, Harare, Zimbabwe
The intended
presentation will be an analysis of language used by an interviewer and that of
the interviewee during an oral interview. It will focus on the language of
penetration by the interviewer, that is, the language somebody usually uses
when he/she approaches a person for an interview in search of specific
information. It will also look at the respondent’s language when he/she
responds to different types of questions as well as that used by the people
concerned in their subsequent conversation. The presentation will also look at
the factors that may shape the respondent’s answers as well as the
interviewer’s follow-up questions. It will furthermore look at the element of
‘misfiring’ by either of the parties and its consequences.
The intended presentation
will focus on the strategies that an interviewer may use when he/she tries to
get information from a respondent. In doing this, the presenter will be guided
by the principle that each interview and each interview setting is different
and needs different skills and also that each situation involves expectations
and assumptions. He/She will also be guided by the assumption that whenever the
sender of information, in this case the interviewer, sends a question, he/she
hopes to be understood by the receiver/interviewee. However, the message may or
may not go through. To see whether it has gone through or not, one has to
assess the feedback that the sender gets. The presenter will also look at the
interviewer’s challenges, some of which include the respondent’s attitude
towards the interviewer or the subject under discussion, the environment of the
interview, misfiring by the respondent, as well as lack of knowledge on the part of the
interviewee.
The presentation will also focus on
what an interviewer needs to do before he/she sets out to conduct an interview.
For example, the interviewer has to be thoroughly prepared. Being prepared
means that one has to formulate one’s questions before starting an interview.
One has to come up with questions that can incite the respondent to say what
he/she knows about the subject under discussion. For example, the questions
have to be structured in a way that is most effective and friendly.
Preparedness also entails getting the right person to interview. Depending on
the purpose or subject of the interview, the interviewer has to get somebody
who can supply the desired information. Besides knowing the subject, the person
has to be willing to spare time for answering questions. This is an important
dimension, especially given the fact that most people are usually busy. Thus,
one may expect to obtain better results if one interviews a person who is
prepared to give out information. The presenter will look at the common
strategies that interviewers usually use to cultivate interest in the respondent.
The presenter will also devote some
time to the qualifications one should possess as a good interviewer. For one to
be effective in getting information, one has to have the skill to ask
questions. The assumption to be adopted here is that a skilled interviewer is
better than one who is not. But this assumption also triggers a few questions.
For example, how does one become skilled? Is it through training or not? How
does personal character determine the end result?
In trying to understand exactly what
goes on between an interviewer and an interviewee, an analysis of their
respective body languages will be part of the investigation. In this case, the
assumption to be adopted is that verbal communication should match what is
implied by body language. The assumption is based on the fact that verbal and
non-verbal messages are intertwined, with the non-verbal symbols usually
complementing the verbal ones. However, the analyses to be made will not be
blind to the fact that sometimes non-verbal symbols may substitute for verbal ones
and also that non-verbal symbols may be inconsistent with verbal ones. Body
language is considered important in oral interviews because it has a direct
impact on what either of the persons involved will say after observing the gesture(s).
The intended
presentation was inspired by the writer’s experiences as an oral interviewer
during data collection for the Shona linguistic corpus. As a result of this,
some of the illustrative examples to be used in the presentation will be drawn
from the oral Shona corpus, that is, from audiocassettes that were recorded
during the mentioned exercise. Other examples will come from general
observation, as well as from analyses of what one usually sees in television
interviews.
Concurrent
Over- and Under-treatment in Dictionaries — The Woordeboek van die
Afrikaanse Taal as a case in point
Gilles-Maurice de Schryver
Department of African Languages and Cultures, Ghent University, Ghent, Belgium
Department of African Languages, University of Pretoria, Pretoria, South Africa
In Prinsloo & De
Schryver (2002) a so-called multidimensional lexicographic Ruler was
introduced. With this powerful instrument measurements and predictions can be
made on various macro- and microstructural dictionary levels. Three levels
have received thorough treatment so far, viz. considerations regarding the relative
size of each alphabetical stretch, the corresponding number of lemma signs, as
well as compilation-time aspects. In this paper the interplay between these
levels is studied with a focus on ‘moving’ average article length, and the
correlated aspects of inclusion versus omission of lemma signs.
In
its most basic form, a Ruler is simply an instrument to guide the relative
alphabetical breakdown in semasiological dictionaries. As such, each
alphabetical category is assigned a certain percentage, reflecting the relative
size of that category. Different languages, and even different types of
dictionaries for a specific language, have different Rulers. The Rulers
themselves are built from statistics derived from electronic corpora, as well
as from existing dictionary data. Just like the physical rulers with which one
measures, they can be made as fine-grained as one wishes, by simply breaking
down the alphabetical categories further into smaller sections. Just like the
human rulers who govern us, a multidimensional lexicographic Ruler can be
called in to manage a project. To date, general-language Rulers for isiNdebele
(De Schryver 2002), Afrikaans (Prinsloo & De Schryver 2003a), and Sesotho
sa Leboa (Prinsloo & De Schryver 2003b), as well as for Tshivenda,
Xitsonga, Setswana and Sesotho have been designed.
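The basic idea of a Ruler, one percentage per alphabetical category, can be illustrated with a minimal sketch. It assumes that the percentages are obtained simply by counting the initial letters of a list of word types; the abstract does not specify the actual procedure, and the function name build_ruler and the eight-word sample are hypothetical.

```python
from collections import Counter


def build_ruler(word_types):
    """Return {initial letter: percentage of all word types}, in alphabetical order.

    Assumption: every word type counts equally; the real Rulers are derived
    from corpus and dictionary statistics by a method not described here.
    """
    counts = Counter(word[0].upper() for word in word_types if word)
    total = sum(counts.values())
    return {letter: 100.0 * counts[letter] / total for letter in sorted(counts)}


# Hypothetical toy word list, for illustration only.
sample = ["aap", "appel", "boek", "boom", "brood", "kat", "kind", "os"]
ruler = build_ruler(sample)

for letter, share in ruler.items():
    print(f"{letter}: {share:.1f}%")
```

A finer-grained Ruler could be obtained in the same way by keying on the first two or three letters instead of the first one.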
During the presentation
it will be indicated that the very same Ruler for a specific language can now
also be used with regard to average article length. The value of this
new dimension can be successfully illustrated when one analyses the huge
multi-volume overall-descriptive Woordeboek van die Afrikaanse Taal
(WAT), in compilation for the past three-quarters of a century and published up to
the letter O (volume XI). Comparing
WAT with a so-called ‘Afrikaans AO-Ruler’ immediately reveals extreme
inconsistencies with regard to average article length. For the letters A
and B, for instance, it is clear that both the number of pages and the number of
lemma signs are heavily under-treated in WAT. The under-treatment in terms of space
allocation, however, is much more severe, which results in a very low average
article length. Up to the letter J, the relative allocation of space is
always smaller than the relative allocation of lemma signs. From K
onwards, a sudden reversal in this pattern occurs, and this remains so up to O.
Throughout K, both space allocation and number of lemma signs are extremely
heavily over-treated compared to the AO-Ruler. It should not come as a surprise
that, after having spent almost 30 years on the compilation of K,
the editors at WAT decided to drastically reconsider their compilation
strategies, and entered a ‘new’ era (cf. Botha 1994: 423). Page-wise the
compilers indeed moved closer to the AO-Ruler, with L and M
slightly above and N and O under the AO-Ruler. As far as the
number of lemma signs is concerned, however, these have been consistently
under-treated, with O an all-time low.
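The kind of comparison described above can be sketched as follows. For each alphabetical category the sketch computes the dictionary's share of pages and of lemma signs, the deviation of each share from the Ruler, and a relative article length (page share divided by lemma share). All figures are hypothetical and are not WAT data; the function names and the two-letter Ruler are assumptions made for illustration.

```python
def shares(counts):
    """Convert absolute counts per letter into percentages of the total."""
    total = sum(counts.values())
    return {letter: 100.0 * n / total for letter, n in counts.items()}


def compare_to_ruler(ruler, pages, lemma_signs):
    """Report over- and under-treatment per letter relative to a Ruler.

    Positive deviations mean over-treatment, negative ones under-treatment.
    A relative article length below 1 means that the space share is smaller
    than the lemma-sign share, i.e. articles are shorter than average.
    """
    page_share = shares(pages)
    lemma_share = shares(lemma_signs)
    report = {}
    for letter in ruler:
        report[letter] = {
            "pages_vs_ruler": page_share[letter] - ruler[letter],
            "lemmas_vs_ruler": lemma_share[letter] - ruler[letter],
            "relative_article_length": page_share[letter] / lemma_share[letter],
        }
    return report


# Hypothetical figures for a two-category Ruler.
ruler = {"A": 40.0, "K": 60.0}
pages = {"A": 100, "K": 300}
lemma_signs = {"A": 300, "K": 700}

report = compare_to_ruler(ruler, pages, lemma_signs)
for letter, row in report.items():
    print(letter, row)
```

In this toy example A is under-treated on both counts, but more severely in terms of space, so its relative article length falls below 1, which mirrors the pattern the abstract reports for the early letters.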
Although everyone will agree that the
compilation of K was unfortunate for WAT, a new negative trend might
have started with the completion of L and then M, where one
observes a concurrent over- and under-treatment, in terms of space
allocation and number of articles respectively. One should guard against the
temptation to move ever faster through the alphabet, as seems to be the case in
the last volume, where space allocation is now also under-treated, the number
of articles even more so, yet where this is masked by an ever-increasing
average article length.
In order to substantiate the latter
claim, an in-depth comparison between WAT and the desktop Verklarende
Handwoordeboek van die Afrikaanse Taal (HAT) will be presented for the
category O. Given that the entire HAT is smaller than the single
category O in WAT, it is logical to assume that every single lemma
sign in HAT should in principle also be entered in WAT. Upon comparison,
however, one has to conclude that as many as 499 o-initial lemma signs from HAT
have not been lemmatised in WAT. Just one of these 499 has been treated as a
sub-lemma in WAT, 40 can only be found as untreated sub-lemmas and 175 as
untreated run-ons, while 211 have not been lemmatised and do not occur anywhere
in the WAT text either. The remaining 72 have not been entered in WAT,
whether as lemmas, as sub-lemmas or as run-ons, despite the fact that those
very same items are used throughout the WAT text itself. Especially
problematic are those missing items that are not only highly frequent in a
10-million-word Afrikaans corpus, but are moreover cross-referred to
from other items in WAT. Numerous examples of such cases will be discussed.
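The breakdown of the 499 items can be read as a classification of each HAT lemma sign that is not a lemma in WAT. A minimal sketch of such a classification follows, assuming that the relevant WAT data are available as plain sets; the dictionary keys, the function name classify_missing and the placeholder items are all hypothetical. The final check confirms that the five reported categories add up to 499.

```python
def classify_missing(hat_lemmas, wat):
    """Classify every HAT lemma sign that is not a WAT lemma.

    `wat` is assumed to hold five sets: lemmas, treated_sublemmas,
    untreated_sublemmas, run_ons and text_words (words used in the WAT text).
    """
    result = {}
    for lemma in sorted(hat_lemmas - wat["lemmas"]):
        if lemma in wat["treated_sublemmas"]:
            status = "treated sub-lemma"
        elif lemma in wat["untreated_sublemmas"]:
            status = "untreated sub-lemma"
        elif lemma in wat["run_ons"]:
            status = "untreated run-on"
        elif lemma in wat["text_words"]:
            status = "used in text only"
        else:
            status = "absent"
        result[lemma] = status
    return result


# Placeholder items, not real dictionary data.
hat = {"item_a", "item_b", "item_c", "item_d", "item_e", "item_f"}
wat = {
    "lemmas": {"item_a"},
    "treated_sublemmas": {"item_b"},
    "untreated_sublemmas": {"item_c"},
    "run_ons": {"item_d"},
    "text_words": {"item_e"},
}
classified = classify_missing(hat, wat)

# Counts as reported in the abstract.
reported = {
    "treated sub-lemma": 1,
    "untreated sub-lemma": 40,
    "untreated run-on": 175,
    "absent": 211,
    "used in text only": 72,
}
total_reported = sum(reported.values())
print(classified, total_reported)
```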
Monitoring the compilation of especially a multi-volume dictionary project with an average article length Ruler is crucial if one is to avoid such major inconsistencies. A concurrent over- and under-treatment in t