Linguistic and algorithmic aspects of
object extraction from natural language texts
Igor P. Kuznetsov
Institute for Informatics Problems of the Russian Academy of
Sciences, Moscow, Russia
igor-kuz@mtu-net.ru
Elena B. Kozerenko
Institute for Informatics Problems of the Russian Academy of
Sciences, Moscow, Russia
kozerenko@mail.ru
Andrew G. Matskevich
Institute for Informatics Problems of the Russian Academy of
Sciences, Moscow, Russia
igor-kuz@mtu-net.ru
Abstract
A semantic linguistic processor that extracts objects and their links from
natural language texts is considered. The paper analyzes the experience of
using the processor to formalize Russian and English texts in several subject
areas: criminology (summaries of incidents, accusatory conclusions, etc.), the
mass media (documents about terrorist activities) and personnel management
(autobiographies, resumes). The peculiarities of the texts are captured in the
linguistic knowledge of the processor, so the system can be tuned to a given
subject area. For each area we examine its special features: the collections of
extracted objects, the means of identifying them, their connections, the
contractions that occur, punctuation and special signs, the specific character
of language constructions, etc. All these features were taken into account when
developing the linguistic knowledge.
Keywords: semantics, natural language, linguistic processor,
knowledge engineering, data extraction
1 Introduction
A tremendous increase in the flow of documents that users obtain through
different information channels (including the Internet) requires new solutions.
A large part of such documents exists in the form of natural language (NL)
texts. In many cases one cannot read and comprehend even a small portion of the
factual information available. Existing information tools can help, but they
require a preliminary formalization of the texts. At the same time, the majority
of end users are interested in specific subject matter. For example, a criminal
investigator seeks to extract information on important figurants, their places
of residence, telephone numbers, criminal events, dates and other such facts; a
personnel manager is interested in the organizations where a person worked,
when, and in what position. Others try to fish out of the media information
about countries, important persons, catastrophes, etc. We call such concrete
information of interest to a user information objects.
Hence the need for a new class of information systems, which would consider the
interests of the end user and be oriented toward extracting information objects
from texts [1-3]. At present this problem is in the focus of attention of many
researchers and developers [4-19].
In this article a class of such systems is presented, based on the use of
special linguistic processors (LP) and knowledge base (KB) technology.
Linguistic processors are necessary for the deep processing of texts, which
reveals information objects and their connections. On the basis of the latter,
the knowledge structures stored in the knowledge base are formed. We call such
processors semantics-oriented. Their special feature is the use of linguistic
knowledge (LK), organized so as to take into account the lexical and semantic
features of natural language when forming the knowledge structures [1, 14]. At
the level of the KB it is possible to address the needs of users more fully by
solving the following tasks.
First, the use of reverse linguistic processors makes it possible to generate
reports and to fill in the required table forms and relational databases.
Second, the support of the expert component makes it possible to update the
information with analytical results obtained by processing the knowledge
structures.
Third, intelligent features are provided through different types of search: the
search for concrete objects, the search for similar objects, the search for
connections, etc. Such forms of search are "semantic" facilities, since the
results are achieved not at the level of words or word forms, but at the level
of the knowledge structures of the KB. We call systems of this type
semantics-oriented.
During the last fifteen years, on the basis of the studies conducted at the
Institute for Informatics Problems of the Russian Academy of Sciences,
semantics-oriented systems and linguistic processors have been developed for the
formalization of natural language texts and their analytical processing in
different subject areas: criminology (summaries of incidents, accusatory
conclusions, etc.), the media (documents about terrorist activities) and
personnel management (autobiographies in the Russian and English languages).
These are the integrated systems DIEZ, IKS, "Analyst", "Criminal" and
Lingua-Master [12-17].
This paper discusses the special features of these systems and of the
linguistic processors and knowledge bases employed in them, as determined by the
tasks and by the specific character of natural language.
2 Criminology information objects
The flows of documents in the criminal police comprise summaries of incidents,
information on criminal cases, accusatory conclusions, etc. These documents
contain much concrete information concerning figurants, their acts, the
instruments of crime and other facts. The basic tasks are different forms of
search. Note that the monthly accumulated volume of new information of this
type amounts to tens and hundreds of megabytes. No one can read all this and
keep it in mind. Full-text databases do not solve this problem: working with NL
texts, they produce much noise (irrelevant documents) and significant loss of
information. The reason for this is a special feature of the Russian language:
the free word order. The words relevant to a query can be scattered throughout
the text of a document and relate to different entities. To reduce these
deficiencies, word proximity criteria are introduced, the endings of word forms
are cut off (normalization), and the normalized words are indexed; however,
this does not radically solve the problem.
Another approach is the use of relational databases. But this requires
labor-consuming work by specially trained people to formalize the NL texts:
extracting persons, addresses, dates, etc. from the documents (incident
descriptions) and filling in the corresponding tables of a database. This is
extremely difficult with large flows of documents.
For this task the system "Criminal" was developed at the end of the 1990s
[12,13]. Its special feature is the automatic analysis of text with the
extraction of the necessary collection of information objects. The "Criminal"
system was verified on 500 thousand incidents from the summaries of the Moscow
Criminal Police Office (GUVD) and showed the following results on the basic
objects: the coefficient of noise (excessive words in the objects) was not more
than 1-2%, and losses were not more than 3%.
The following basic objects must be singled out (with minimum loss):
• persons (by family name, given name and patronymic - FNP) with their role
features (criminal, victim);
• the verbal description of the persons, their distinctive signs;
• address, posting information attributes;
• date(s) mentioned;
• weapon with its special features;
• telephone numbers, faxes, e-mails with their subsequent standardization;
• the means of transport with the indication of the vehicle type, its state
number, color and other attributes;
• passport data and other documents with their attributes;
• explosives and narcotic substances;
• police departments;
• the police officers.
Secondary objects (their loss is less fatal):
• organizations;
• positions;
• quantitative characteristics (how many persons or other objects participated
in an event);
• the numbers of accounts, sums of money with the indication of the currency
type.
Connections:
• event (criminal, terrorist, breakdown of articles and so on) with the
indication of the information objects participating in it;
• time and place of events;
• the connection between different types of information objects (with whom a
person works in an organization or lives at the same address, in what events
the person participated together with other objects, etc.).
Some difficulties of extracting objects from texts are as follows. First, there
are difficulties connected with the special features of the Russian language:
the free word order, the presence of homonymy and polysemy, and the variety of
language forms expressing one and the same meaning (synonymy). For example, any
event can be expressed by verbal forms, verbal nouns, participial
constructions, etc.; they must be reduced to one form.
Second, there is (especially in the summaries of incidents) a large number of
abbreviations, which must be deciphered by analyzing the context. For example,
g. can indicate YEAR, CITY, STATE and others.
Third, there are many omissions. For example, a figurant may be followed simply
by an address, a year of birth and other data. These must be connected with the
figurant.
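The context-based deciphering of abbreviations can be illustrated with a minimal sketch. The two rules below (a number before "g." suggests a year, a capitalized word after it suggests a city) and the function names are assumptions made for illustration; they are not the rules of the "Criminal" processor, which are stored in its linguistic knowledge.

```python
import re


def expand_g(tokens, i):
    """Guess the meaning of the token 'g.' at position i from its neighbours.

    The rules are hypothetical examples of context analysis.
    """
    prev_tok = tokens[i - 1] if i > 0 else ""
    next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
    # A 2- or 4-digit number right before "g." suggests a year: "1975 g."
    if re.fullmatch(r"\d{2}|\d{4}", prev_tok):
        return "YEAR"
    # A capitalized word right after "g." suggests a city name: "g. Moscow"
    if next_tok[:1].isupper():
        return "CITY"
    return "UNKNOWN"


def annotate(text):
    """Return (token, meaning) pairs; meaning is None for ordinary tokens."""
    tokens = text.split()
    return [
        (tok, expand_g(tokens, i)) if tok == "g." else (tok, None)
        for i, tok in enumerate(tokens)
    ]


if __name__ == "__main__":
    for token, meaning in annotate("born 1975 g. lives in g. Moscow"):
        print(token, meaning)
```

A real processor would need many more rules and an ordering between them; the sketch only shows that the decision depends on neighbouring tokens rather than on the abbreviation itself.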
An important task is the identification of objects (figurants) throughout the
entire text, taking into account demonstrative pronouns, brief names and
anaphoric references. This is especially necessary for accusatory conclusions
(verdicts), where one and the same person is mentioned repeatedly (by different
methods of naming) throughout the entire document. Taking these difficulties
into account, and in accordance with the tasks, the linguistic processor of the
"Criminal" system was developed. It performs the normalization of words, their
grouping into units, the identification of objects and the establishment of
connections. As a result, for each NL document a semantic network called the
meaningful document portrait is constructed automatically. These portraits are
the knowledge structures of the knowledge base, which serve as the basis for
implementing different forms of semantic search: the search by features and
connections, the search for objects connected at different levels, the search
for similar figurants and incidents, and the search by distinctive signs (with
the use of ontologies).
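The search for objects connected at different levels can be sketched as a breadth-first walk over a small graph. The class `Portrait` and its methods are hypothetical names chosen for this illustration; the actual knowledge structures of the "Criminal" system are not described at this level of detail in the paper.

```python
from collections import defaultdict, deque


class Portrait:
    """A toy semantic network: typed objects plus undirected connections."""

    def __init__(self):
        self.objects = {}               # id -> (type, value)
        self.links = defaultdict(set)   # id -> ids of directly connected objects

    def add_object(self, obj_id, obj_type, value):
        self.objects[obj_id] = (obj_type, value)

    def connect(self, a, b):
        self.links[a].add(b)
        self.links[b].add(a)

    def connected(self, start, max_level):
        """Return ids reachable from start through at most max_level links."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, level = queue.popleft()
            if level == max_level:
                continue  # do not expand beyond the requested level
            for nxt in sorted(self.links[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    found.append(nxt)
                    queue.append((nxt, level + 1))
        return found


if __name__ == "__main__":
    p = Portrait()
    p.add_object(1, "FNP", "IVANOV")
    p.add_object(2, "Address", "TVERSKAYA 1")
    p.add_object(3, "FNP", "PETROV")
    p.connect(1, 2)   # Ivanov lives at the address
    p.connect(2, 3)   # Petrov lives at the same address
    print(p.connected(1, 1))
    print(p.connected(1, 2))
```

With level 1 only the address is found for the first person; with level 2 the second person sharing that address is found as well, which corresponds to a connection "through" a common object.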
The expert component supports the classification of incidents according to the
catalogs of the criminal police: the "form of crime", the "method of committing
the crime" and others. The result is entered into the meaningful portrait.
There is a complete set of means for tuning to the subject area.
3 Personnel management tasks
One of the key problems of personnel and recruiting agencies is the automatic
processing of autobiographical data and job applications (resumes), written in
a fairly arbitrary form, i.e. in the form of NL text. Such texts contain the
following information about a person: family name, given name and patronymic
(FNP), year of birth, address, the time and place of studies with the name of
the educational establishment, etc.
Their automatic formalization is required, with the extraction of information
objects and their mapping into the fields of an assigned form or a web site.
The standard means of databases can then be used to solve user tasks. In many
agencies this formalization is done by hand: by specially trained people, or by
the applicant, who is asked to enter the information into the indicated fields
of the required form. This work is rather labor-consuming.
The linguistic processor of the "Criminal" system was taken as the prototype
for automating this work. However, it was customized in accordance with the
special features of the subject area [17].
First, a different collection of objects and connections has to be extracted.
Second, the objects are divided into groups differently. For example, objects
(organizations, dates and others) are grouped into those which relate to the
studies of a person and those which relate to the person's professional
activity.
Third, expert systems are needed for the logical inference of data that are
stated implicitly. We refer to such data as expert objects.
Basic objects:
• a person who composes the application (as a rule, at the very beginning of
the application);
• the date of birth or age;
• e-mail;
• postal address;
• home telephone;
• cell phone;
• office telephone;
• personal Internet page;
• the desired position;
STUDIES
• the name of an educational institution;
• department (specialty);
• diploma (degree);
• the beginning of studies (date);
• the end of studies (date);
PROFESSIONAL EXPERIENCE
• the beginning of work (date);
• the end of work (date);
• the name of organization;
• the held position;
• responsibility, function, achievement.
COURSES (instruction)
• the conducting organization;
• name of the course;
• diploma (certificate);
• the beginning of the course;
• the end of the course.
Expert objects:
• gender;
• education (secondary, higher and other);
• professional area (according to the assigned classification);
• specialization (according to the assigned classification);
• work experience (the number of years is summed);
• the region (it is calculated from the address).
The extraction of the major part of these objects required only the
modification of linguistic knowledge (LK). However, the special features of the
texts and of the tasks to be solved required strengthening the facilities of
the linguistic processor. This was caused by the following factors.
First, by the variety of NL forms expressing dates and time intervals. For
example, dates can be in a contracted form (avg.05), in the form of fractions
(09.99 g.), or with different kinds of special signs or quotation marks (09/99
or 09'y999), etc.; intervals appear as 15.05-01.12.99, May-June 06 and other
variants. The difficulties are caused by the confusion of dates with fractions,
the absence of keywords of the type g. (yr), etc. Moreover, one of the
requirements was to bring the dates to a standard form, i.e. to interpret the
contractions.
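Bringing such dates to a standard form can be sketched with a few regular expressions. The sketch covers only the numeric forms mentioned above (09.99, 09/99, 15.05-01.12.99); the function names, the ISO-like output format and the pivot used to expand two-digit years are assumptions for illustration, not the processor's actual rules.

```python
import re

MONTH_YEAR = re.compile(r"^(\d{1,2})[./](\d{2}|\d{4})$")
FULL_DATE = re.compile(r"^(\d{1,2})\.(\d{1,2})\.(\d{2}|\d{4})$")
DAY_MONTH = re.compile(r"^(\d{1,2})\.(\d{1,2})$")


def expand_year(digits):
    """Expand a 2-digit year; the pivot of 30 is an illustrative assumption."""
    year = int(digits)
    if len(digits) == 4:
        return year
    return 2000 + year if year < 30 else 1900 + year


def normalize_date(text):
    """Return 'YYYY-MM-DD' or 'YYYY-MM' for DD.MM.YY, MM.YY or MM/YY, else None."""
    text = text.strip()
    m = FULL_DATE.match(text)
    if m:
        day, month = int(m.group(1)), int(m.group(2))
        year = expand_year(m.group(3))
        if 1 <= day <= 31 and 1 <= month <= 12:
            return f"{year:04d}-{month:02d}-{day:02d}"
        return None
    m = MONTH_YEAR.match(text)
    if m:
        month, year = int(m.group(1)), expand_year(m.group(2))
        # A "month" above 12 is more likely a fraction than a date.
        if 1 <= month <= 12:
            return f"{year:04d}-{month:02d}"
    return None


def normalize_interval(text):
    """Handle 'DD.MM-DD.MM.YY': the start date inherits the year of the end date."""
    start, sep, end = text.partition("-")
    if not sep:
        return None
    end_iso = normalize_date(end)
    m = DAY_MONTH.match(start.strip())
    if end_iso is None or len(end_iso) != 10 or not m:
        return None
    start_iso = f"{end_iso[:4]}-{int(m.group(2)):02d}-{int(m.group(1)):02d}"
    return (start_iso, end_iso)


if __name__ == "__main__":
    print(normalize_date("09.99"))
    print(normalize_interval("15.05-01.12.99"))
```

The range check on the month is one simple way to reject some fractions, but a value such as 3.14 would still be read as a date here; resolving such cases requires the surrounding context, as the text notes.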
Second, certain difficulties were caused by the tasks of grouping the objects
into types and composing the rules of their layout. For example, in resumes
such objects as organizations (where a person worked or studied), positions,
periods of work and basic responsibilities are quite often sequenced
arbitrarily. If the time interval of work in an organization is recorded at the
end and another organization is mentioned next, then it is necessary to
determine to which organization this time interval should be assigned. Time
intervals, dates or other organizations (for example, the customers of a
project) can also stand inside the text of the description of work, which
causes additional difficulties. A human can easily understand what relates to
what. But it is rather difficult to design formal criteria of separation and
correlation which would give a tolerable degree of noise and losses. For this
purpose special means were introduced into the linguistic processor, which,
relying on dates (or organizations), search for the objects connected with
them.
Third, many users created their resumes on the basis of documents taken from
different tables and forms. As a consequence, punctuation marks (periods) were
absent, and special signs left over from the recoding of the text were present.
An entire resume (if it contained no empty lines) was interpreted as one
sentence. To overcome this, the block of morphological-lexical analysis was
supplied with special means of tuning, i.e. rules for separating sentences. For
example, if a word is a verb written with a capital letter and occupies the
first position in a line, then it is the beginning of a sentence. There are
many such heuristic rules, including those which consider the role of special
signs, separating symbols, etc.
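The rule about a capitalized verb at the beginning of a line can be sketched as follows. The small verb set stands in for the morphological analysis and is purely an assumption for this example, as is the function name.

```python
# Toy verb lexicon standing in for the morphological analysis (an assumption).
VERBS = {"developed", "managed", "designed", "led", "worked"}


def split_sentences(text):
    """Start a new sentence at an empty line or at a line that opens
    with a capitalized verb; other lines continue the current sentence."""
    sentences, current = [], []
    for line in text.splitlines():
        words = line.split()
        if not words:
            # An empty line always closes the current sentence.
            if current:
                sentences.append(" ".join(current))
                current = []
            continue
        first = words[0]
        starts_with_verb = first[0].isupper() and first.lower() in VERBS
        if starts_with_verb and current:
            sentences.append(" ".join(current))
            current = []
        current.append(" ".join(words))
    if current:
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    resume = (
        "Developed billing software\n"
        "for a bank\n"
        "Managed a team of five\n"
        "\n"
        "moscow office"
    )
    for sentence in split_sentences(resume):
        print(sentence)
```

Without the verb rule the first three lines would be merged into a single sentence, which is the situation described above for resumes without punctuation.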
Fourth, to obtain expert data (objects), expert systems (ES) were built into
the LP; they assign a document to a specific category (an item of a classifier)
on the basis of the analysis of meaningful portraits. Two types of ES shells
are implemented in the system. The first is based on weight coefficients of the
words which correspond to a specific category. The second is based on the
presence of words in the information objects.
In an ES of the first type, words are connected with each category together
with their weights. These weights result from the statistical analysis of
sample documents (classified by a human), i.e. a training stage using machine
learning methods is envisaged.
In an ES of the second type, each category is connected with characteristic
words or pairs of words (word combinations) taken from the fragments which
correspond to information objects of the indicated type. One and the same word
or word combination can be related to only one category.
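The two kinds of expert systems can be contrasted in a short sketch. The scoring by a plain sum of weights, the tie handling and all names and sample values are assumptions for illustration; the paper does not specify how the weights are combined.

```python
def classify_by_weights(words, weights):
    """ES of the first type: sum per-category word weights, pick the best.

    weights maps category -> {word: weight}; a word may occur in several
    categories with different weights.
    """
    scores = {
        category: sum(table.get(w, 0.0) for w in words)
        for category, table in weights.items()
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def classify_by_markers(object_words, markers):
    """ES of the second type: each marker word belongs to exactly one category,
    and only words taken from extracted information objects are examined."""
    hits = {}
    for w in object_words:
        category = markers.get(w)
        if category is not None:
            hits[category] = hits.get(category, 0) + 1
    return max(hits, key=hits.get) if hits else None


if __name__ == "__main__":
    weights = {
        "IT": {"programmer": 2.0, "database": 1.5},
        "Finance": {"accountant": 2.0, "audit": 1.0, "database": 0.5},
    }
    markers = {"programmer": "IT", "accountant": "Finance"}
    print(classify_by_weights(["programmer", "database", "moscow"], weights))
    print(classify_by_markers(["senior", "accountant"], markers))
```

The dictionary `markers` maps a word to a single category, which mirrors the restriction that one word or word combination relates to only one category in the second type of ES.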
And finally, there was the need for a reverse linguistic processor which would
convert the objects into NL components and map them into the fields of a form
or a web site. This processor has its own linguistic knowledge, which assigns
the order in which the headings (fields) are output and specifies which objects
are expected to fill them. To extract such objects, their names (ORG_, WORK_,
...) and the connections given in the meaningful portrait are used. For each
selected object, its description is constructed from the normalized words which
constitute it. Then the sentence containing the object is located through the
object, and the positioning means give the place of this sentence in the text.
Within this interval, the piece of the sentence which corresponds to the
description of the object is found. This piece is the resulting output.
4 Documents of the media on terrorist activities
The problem of information support for antiterrorist activities in the
contemporary world is very acute and attracts the attention of researchers;
however, working knowledge extraction systems for this field are only beginning
to be created [18]. The principal task here is the extraction of the documents
which relate to terrorist activity from the flow of media communications, with
the subsequent analysis of these documents. The linguistic processor of the
"Criminal" system was taken as the prototype for automating this work. It was
further developed in accordance with the special features of the subject area
and the tasks. The following information objects were additionally introduced
into the LP:
• terrorist groups and organizations (Terrorizm);
• participants of terrorist groups with the indication of their roles (leader,
head of, etc.);
• the armed forces assigned to antiterrorist combat (Military_Force);
• time intervals (see Section 3).
We developed linguistic knowledge (LK) for the extraction of these objects. In
accordance with the specific characteristics of the texts, the LK was augmented
with new rules for the extraction of objects, for example, the extraction of
the place of an event in forms such as "in 25 km from Kabul" or "the camp near
Umma city" and so forth. The character of composite names with the elements Abu
(father), Ibn or Ben (son) was taken into account, for example, Abd ar-Rasul,
Ben-Achmad. Accordingly, the FNP field became more complicated. For well-known
terrorists reduced names are used as a rule, for example, Ben Laden (instead of
Osama Ben Laden), Basayev (Shamil' Basayev), etc. Special means for their
identification were introduced into the linguistic processor.
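The identification of reduced names with full names can be sketched as a token-subset test. This is only one possible heuristic and is an assumption of this example, as are the function names and the particle list; the paper does not describe the actual identification means.

```python
import re

# Name particles that do not identify a person on their own (illustrative list).
PARTICLES = {"abu", "ibn", "ben", "al", "ar"}


def name_tokens(name):
    """Lower-case a name and split it on spaces, hyphens and apostrophes."""
    return [t for t in re.split(r"[\s\-']+", name.lower()) if t]


def resolve(mention, known_names):
    """Return the known full names whose tokens include every token of the mention."""
    tokens = set(name_tokens(mention))
    # A mention made only of particles (or empty) carries no identifying information.
    if not tokens - PARTICLES:
        return []
    return [full for full in known_names if tokens <= set(name_tokens(full))]


if __name__ == "__main__":
    known = ["Osama Ben Laden", "Shamil' Basayev"]
    print(resolve("Ben Laden", known))
    print(resolve("Basayev", known))
    print(resolve("Ben", known))
```

If several full names share a surname, the function returns all of them, and a further step (for example, preferring the name mentioned most recently in the text) would be needed to choose one.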
As in the previous cases, all versions of an object name possible in the text,
including the brief form, were considered when extracting objects. Standard
objects (FNP, dates, addresses, the forms of weapon and others) are reduced to
one (standard) form. The identification of objects is performed taking into
account brief designations (for example, separate surnames or given names from
an FNP), anaphoric references (demonstrative and personal pronouns, for
example, "this person", "it..."), and definitions and explanations (for
example, "the mayor of Moscow Luzhkov" is identified with the subsequent words
"mayor", "Luzhkov"). For the extraction of events and connections, the analysis
of verbal forms and of participial and adverbial constructions is carried out.
At the same time, the basic task of the LP differed from the previous cases: it
had to operate (as a separate module) within the framework of integrated
systems of information collection and processing. The exchange was conducted
through XML files [20]. To that end a reverse LP was developed, which
constructs XML files on the basis of meaningful portraits (see Appendix 1).
Thus, the input of the linguistic processor (LP) is a natural language text,
and the output is an XML file in which all extracted objects and connections
are represented with the indication of their sources. This LP, named Semantix,
is provided in the form of an SDK module. It works under Windows, but it can be
recompiled to work under Linux.
The Semantix processor is an independent module and can be used without the
mentioned systems for the standard tasks of analytical services. There are
means of tuning it to objects of other types through the linguistic knowledge
or the dictionaries.
Let us give some explanations. Each object has the following structure:
<OBJECT ID="7" TYPE="Organization">
<ARG CONST="Headquarters" />
<ARG CONST="Residence" />
...
<SOURCE> Headquarters residence of the opposing group</SOURCE>
</OBJECT>
where ID="7" is the identifier of the object and TYPE="Organization" is its
type. The text fragment corresponding to the object is also given (SOURCE).
Relations between objects and their participation in actions are given through
REF=... references. For example, the construction
<ACTION ID="15" TYPE="Blow">
<ARG CONST="At" />
<ARG REF="7" />
</ACTION>
represents the sentence "one of the blows struck the headquarters of the
oppositional group". For each object or action the reference to the sentence
is given. The Semantix processor uses fairly universal XML constructions: one
object can include another object (through a reference). Properties are given
as arguments. If necessary, the type of an attribute is indicated.
For example, the statement
<ATTR TYPE="YEAR" VALUE="2003"/>
indicates the year, etc. The XML file contains the complete set of information
items necessary for use in different integrated systems.
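A consumer of such a file mainly has to resolve the REF arguments of actions to the objects they point at. The following minimal sketch does this with the Python standard library. The enclosing DOCUMENT element follows Appendix 1; the function name `describe_actions` and the shape of the returned tuples are assumptions made for this illustration and are not part of the Semantix SDK.

```python
import xml.etree.ElementTree as ET

# The two fragments discussed above, wrapped in a DOCUMENT element.
SAMPLE = """<DOCUMENT>
  <OBJECT ID="7" TYPE="Organization">
    <ARG CONST="Headquarters" />
    <ARG CONST="Residence" />
    <SOURCE> Headquarters residence of the opposing group</SOURCE>
  </OBJECT>
  <ACTION ID="15" TYPE="Blow">
    <ARG CONST="At" />
    <ARG REF="7" />
  </ACTION>
</DOCUMENT>"""


def describe_actions(xml_text):
    """Return (action type, arguments) pairs with REF arguments resolved."""
    root = ET.fromstring(xml_text)
    objects = {obj.get("ID"): obj for obj in root.iter("OBJECT")}
    result = []
    for action in root.iter("ACTION"):
        args = []
        for arg in action.findall("ARG"):
            ref = arg.get("REF")
            if ref is None:
                args.append(("CONST", arg.get("CONST")))
            elif ref in objects:
                obj = objects[ref]
                source = (obj.findtext("SOURCE") or "").strip()
                args.append((obj.get("TYPE"), source))
            else:
                # The reference points at something other than an OBJECT.
                args.append(("REF", ref))
        result.append((action.get("TYPE"), args))
    return result


if __name__ == "__main__":
    for action_type, args in describe_actions(SAMPLE):
        print(action_type, args)
```

Since identifiers are compared as strings, the sketch makes no assumption about their numeric form; references to actions (as in the RELATION elements of Appendix 1) are left unresolved and passed through as ("REF", id).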
5 Conclusion
Object-oriented linguistic processors can be used in different application
areas where the extraction of useful information from natural language texts is
required. In this respect, the processors described in this work possess a
number of essential advantages. Recently appeared systems such as Integro
Ontos, Arion, etc. extract (as far as we know) objects of only several types.
As a rule, these are person, organization, date and address.
The processors of the Semantix, Lingua-Master and "Criminal" systems extract up
to 40 types of objects with high accuracy and minimum noise. For example, the
system "Criminal" was verified on about 500 thousand incidents from the
summaries of the Moscow Criminal Police Department, and on the basic objects it
showed the following results: the coefficient of noise (i.e. excessive words in
the objects) is not more than 1-2% and losses are not more than 3%. The
Semantix processor was debugged on a smaller quantity of documents dealing with
terrorist activity, and therefore it may show more noise and losses. But this
can be quickly corrected. The fact is that it is impossible to foresee
everything that can be encountered in NL texts. Therefore, in the first place,
representative collections of test documents are extremely important, and in
the second place, so are the means of correcting or tuning the linguistic
processors: hybrid approaches comprising hand-made rules and statistical means
for rapid correction and fine adjustment of linguistic knowledge.
Our systems contain an entire complex of such means, which ensure rapid tuning
to applications (including the introduction of new objects and connections)
taking into account the demands of customers [19]. Note that in the mentioned
processors the objects are brought to a standard form (for example, FNP,
address, date) with the indication of the types of components. A sufficiently
in-depth analysis of sentences is conducted, revealing verbal forms and
identifying objects throughout the entire text. The analysis of complex
language structures is ensured: forms with verbal nouns, participial and
adverbial constructions, coordinated terms, etc. The expert component is also
supported. The Semantix processor can be used as a stand-alone (independent)
module [21]. At present the English language version of the object-oriented
linguistic processor Semantix [15, 16, 19, 21] is being developed.
References
[1] Kuznetsov, I.P.
Semanticheskie Predstavleniia. Moscow: Nauka,
1986, 290 p.
[2] Cunningham, H. Automatic Information
Extraction // Encyclopedia of Language and Linguistics, 2nd ed. Elsevier,
2005.
[3] Han J. and Kamber, M.
Data Mining: Concepts and Techniques // Morgan Kaufmann, 2006.
[4] FASTUS: a Cascaded
Finite-State Transducer for Extracting Information from
Natural-Language Text // AIC, SRI International. Menlo Park, California, 1996.
[5] Ferrucci, D. and Lally, A.
UIMA: an architectural approach to unstructured information processing in the
corporate research environment // Natural Language Engineering 10 (3/4), 2004, 327–348.
[6] Byrd, R. and Ravin, Y.
Identifying and Extracting Relations in Text // 4th International Conference on
Applications of Natural Language to Information Systems (NLDB). Klagenfurt,
Austria, 1999.
[7] Popov, B. et al. KIM - A Semantic Platform for Information
Extraction and Retrieval // Journal of Natural Language Engineering, 10(3-4),
2004, pp. 375-392.
[8] Doddington, G. et al. Automatic
Content Extraction (ACE) program - task definitions and performance measures //
Fourth International Conference on Language Resources and Evaluation (LREC),
2004.
[9] Han, J., Pei, J., Yin, Y., and
Mao, R. Mining Frequent Patterns without Candidate Generation: A
Frequent-Pattern Tree Approach // Data
Mining and Knowledge Discovery, 8(1), 2004, pp. 53–87.
[10] Dong, G. and J. Li. Efficient mining of emerging patterns:
Discovering trends and differences // Proceedings of the Fifth ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining,
S. Chaudhuri and D. Madigan, editors,
ACM Press, San Diego, CA, 1999, pp. 43–52.
[11] Kozerenko,
E.B. Multilingual Processors: a Unified Approach to Semantic and Syntactic
Knowledge Presentation. In Proceedings of the International
Conference on Artificial Intelligence IC-AI'2001. H.R. Arabnia (ed.), Las Vegas, Nevada, USA, June 25-28, 2001. CSREA Press, 2001, pp.1277-1282.
[12] Kuznetsov I.P. Methods of Processing
Reports with the Extraction of Figurants and Events Features // In Dialogue'99: Proceedings of the International Workshop
"Computational Linguistics and its Applications", Vol.2, Tarusa, 1999.
[13] Kuznetsov I.P., Matskevich A.G. The System for Extracting Semantic
Information from Natural Language Texts // Proceedings of the Dialog International
Workshop "Computational Linguistics and its Applications", Vol.2,
Moscow: Nauka, 2002.
[14] Kuznetsov I.P. Natural
Language Texts Processing Employing the Knowledge Base Technology // Sistemy i Sredstva
Informatiki, Vol.13, Moscow: Nauka,
2003, pp. 241-250.
[15] Kuznetsov, I., Kozerenko, E. The system for extracting semantic
information from natural language texts // Proceeding of International
Conference on Machine Learning. MLMTA-03, Las Vegas US, 23-26 June 2003, p.
75-80.
[16] Kuznetsov I.P., Matskevich A.G. The English Language Version of
Automatic Extraction of Meaningful Information from Natural Language Texts // Proceedings
of the Dialog-2005 International Conference "Computational Linguistics and
Intelligent Technologies", Zvenigorod, 2005, pp. 303-311.
[17] Kuznetsov I.P., Matskevich A.G. Semantics Oriented Linguistic
Processor for Automatic Formalization of Autobiographical Data // Proceedings
of the Dialog-2006 International Conference "Computational Linguistics and
Intelligent Technologies", Bekasovo, 2006, pp. 317-322.
[18] Voss, S. and Joslyn C.A.
Advanced Knowledge Integration in Assessing Terrorist Threats // LANL Technical
Report LAUR 02-7867,
2002.
[19] Somin N.V., Solovyova N.S., Charnine M.M.
The System for Morphological Analysis: the Experience
of Employment and Modification // Sistemy i Sredstva Informatiki,
Vol. 15, Moscow: Nauka, 2005, pp. 20-30.
[20] Gardner, J. R. and Z. L. Rendon, XSLT and XPATH:
A Guide to XML Transformations, Prentice Hall, 2001.
[21] Web site with the demo version of the Semantix
system: http://semantix4you.com
Appendix 1
Input Text:
12:16 27.12.2002 One of leaders of insurgents - Arabian Abu-Tarik is destroyed
in the Chechen Republic. In the Chechen Republic one of leaders of Islam
terroristic group the mercenary Abu-Tarik - assistant of Abu al-Valod,
successor of Hattab, is destroyed. As has informed the Ministry of Foreign
Affairs of the Chechen Republic, joint forces of Chechen special militia and
divisions of federal forces destroyed the insurgent in settlement Starye Atagi
of Groznensky region during the addressed check up. In one of the houses there
were found the hiding place with the confidential Arabian documents, three
sub-machine guns and grenades, ammunition. There are no losses among the
participants of the operation.
XML-file (Semantix
output):
<?xml version=`1.0`
encoding=`windows-1251`?>
<DOCUMENT DOC_NUM=`0`>
<OBJECT ID=`1` TYPE=`Date`>
<ATTR TYPE=`YEAR` VALUE=`2002`/>
<ATTR TYPE=`MONTH` VALUE=`DEC.`/>
<ATTR TYPE=`DAY` VALUE=`27`/>
<ATTR TYPE=`HOUR` VALUE=`12`/>
<ATTR TYPE=`MINUTE` VALUE=`16`/>
<SOURCE> 12 16 27. 12.</SOURCE>
</OBJECT>
<OBJECT ID=`2` TYPE=`Terrorizm`>
<ARG CONST=`1`/>
<ARG CONST=`LEADER`/>
<ARG CONST=`OF`/>
<ARG CONST=`INSURGENT`/>
<SOURCE> Leaders of
insurgents</SOURCE>
</OBJECT>
<OBJECT ID=`3` TYPE=`FIO`>
<ATTR TYPE=`SURNAME` VALUE=`ABU -
TARIK`/>
<SOURCE> Abu tarik
-</SOURCE>
</OBJECT>
<OBJECT ID=`4` TYPE=`Place`>
<ARG CONST=`CHECHEN`/>
<ARG CONST=`REPUBLIC`/>
<SOURCE> Chechen
Republic</SOURCE>
</OBJECT>
<OBJECT ID=`5` TYPE=`Terrorizm`>
<ARG CONST=`1`/>
<ARG CONST=`LEADER`/>
<ARG CONST=`OF`/>
<ARG CONST=`ISLAM`/>
<ARG CONST=`TERRORISTIC`/>
<ARG CONST=`GROUP`/>
<SOURCE> Leaders of Islam terroristic
group</SOURCE>
</OBJECT>
<OBJECT ID=`6` TYPE=`FIO`>
<ATTR TYPE=`SURNAME` VALUE=`ABU`/>
<ATTR TYPE=`NAME` VALUE=`AL-VALOD`/>
<SOURCE> Abu al Valod</SOURCE>
</OBJECT>
<OBJECT ID=`7` TYPE=`FIO`>
<ATTR TYPE=`SURNAME` VALUE=`HATTAB`/>
<ATTR TYPE=`NAME` VALUE=`HASAN`/>
<SOURCE> Hattab</SOURCE>
</OBJECT>
<OBJECT ID=`8` TYPE=`Organization`>
<ARG CONST=`MINISTRY`/>
<ARG CONST=`OF`/>
<ARG CONST=`FOREIGN`/>
<ARG CONST=`AFFAIR`/>
<ARG CONST=`OF`/>
<ARG CONST=`CHECHEN`/>
<ARG CONST=`REPUBLIC`/>
<SOURCE> Ministry of Foreign Affairs
of the Chechen Republic</SOURCE>
</OBJECT>
<OBJECT ID=`9` TYPE=`Organization`>
<ARG CONST=`CHECHEN`/>
<ARG CONST=`SPECIAL`/>
<ARG CONST=`MILITIA`/>
<SOURCE> Chechen special
militia</SOURCE>
</OBJECT>
<OBJECT ID=`10` TYPE=`Military_Force`>
<ARG CONST=`JOINT`/>
<ARG CONST=`FORCE`/>
<ARG CONST=`OF`/>
<ARG REF=`9`/>
<SOURCE> Joint forces of Chechen
special militia</SOURCE>
</OBJECT>
<OBJECT ID=`11` TYPE=`Military_Force`>
<ARG CONST=`DIVISION`/>
<ARG CONST=`OF`/>
<ARG CONST=`FEDERAL`/>
<ARG CONST=`FORCES`/>
<SOURCE> Divisions of federal
forces</SOURCE>
</OBJECT>
<OBJECT ID=`12` TYPE=`Place`>
<ARG CONST=`SETTLEMENT`/>
<ARG CONST=`STARYE`/>
<ARG CONST=`ATAGI`/>
<ARG CONST=`OF`/>
<ARG CONST=`GROZNENSKY`/>
<ARG CONST=`REGION`/>
<SOURCE> settlement Starye Atagi of Groznensky region</SOURCE>
</OBJECT>
<OBJECT ID=`13` TYPE=`Weapon`>
<ARG CONST=`SUB `/>
<ARG CONST=`MACHINE`/>
<ARG CONST=`GUN`/>
<SOURCE> Sub machine
guns</SOURCE>
</OBJECT>
<OBJECT ID=`14` TYPE=`Weapon`>
<ARG CONST=`GRENADE`/>
<SOURCE> Grenades</SOURCE>
</OBJECT>
<OBJECT ID=`15` TYPE=`Position`>
<ARG CONST=`PARTICIPANT`/>
<ARG CONST=`OF`/>
<ARG CONST=`OPERATION`/>
<SOURCE> Participants of the
operation</SOURCE>
</OBJECT>
<RELATION TYPE=`SUCCESSOR`>
<ARG REF=`6`/>
<ARG REF=`7`/>
</RELATION>
<RELATION TYPE=`ASSISTANT`>
<ARG REF=`3`/>
<ARG REF=`6`/>
</RELATION>
<ACTION ID=`16` TYPE=`DESTROY`>
<ARG CONST=`ARABIAN`/>
<ARG REF=`3`/>
</ACTION>
<RELATION TYPE=`Where`>
<ARG REF=`16`/>
<ARG REF=`4`/>
</RELATION>
<ACTION ID=`17` TYPE=`INFORM`>
<ARG REF=`8`/>
</ACTION>
<ACTION ID=`18` TYPE=`DESTROY`>
<ARG REF=`10`/>
<ARG REF=`11`/>
</ACTION>
<ACTION ID=`19` TYPE=`CHECK UP`>
<ARG CONST=`ADDRESS`/>
</ACTION>
<ACTION ID=`20` TYPE=`FIND`>
<ARG CONST=`1`/>
<ARG CONST=`HOUSE`/>
<ARG CONST=`HIDE`/>
<ARG CONST=`PLACE`/>
<ARG CONST=`CONFIDENTIAL`/>
<ARG CONST=`ARABIAN`/>
<ARG CONST=`DOCUMENT`/>
</ACTION>
<ACTION ID=`21` TYPE=`BE NO`>
<ARG CONST=`LOSS`/>
<ARG REF=`15`/>
</ACTION>
<SENTENCE>
<ARG REF=`1`/>
<ARG REF=`2`/>
<ARG REF=`16`/>
<SOURCE>12:16 27.12.2002 One of leaders of insurgents - Arabian Abu-Tarik is destroyed in the Chechen Republic.
</SOURCE>
</SENTENCE>
<SENTENCE>
<ARG REF=`4`/>
<ARG REF=`5`/>
<ARG CONST=`MERCENARY`/>
<ARG REF=`3`/>
<ARG REF=`6`/>
<ARG REF=`7`/>
<ARG CONST=`DESTROY`/>
<SOURCE>In the Chechen Republic one of leaders of Islam terroristic group the mercenary Abu-Tarik
- assistant of Abu al-Valod, successor of
Hattab, is destroyed. </SOURCE>
</SENTENCE>
<SENTENCE>
<ARG REF=`17`/>
<ARG REF=`18`/>
<ARG REF=`19`/>
<SOURCE>As have informed the Ministry of Foreign Affairs of the Chechen Republic, joint forces of Chechen special
militia and divisions of federal forces destroy the
insurgent in settlement Starye Atagi of Groznensky region during the addressed check up.
</SOURCE>
</SENTENCE>
<SENTENCE>
<ARG REF=`20`/>
<ARG CONST=`3`/>
<ARG REF=`13`/>
<ARG CONST=`AND`/>
<ARG REF=`14`/>
<ARG CONST=`AMMUNITION`/>
<SOURCE>In one of the houses there were found the hiding place with the confidential Arabian documents, three sub-machine
guns and grenades, ammunition. </SOURCE>
</SENTENCE>
<SENTENCE>
<ARG REF=`21`/>
<SOURCE>There are no losses among the participants of the operation</SOURCE>
</SENTENCE>
</DOCUMENT>