Computer Science
Recent Submissions

Stojadinović, Mirko (Beograd, 2016)
Abstract: Many real-world problems can be modeled as constraint satisfaction problems (CSPs) and then solved by one of the many techniques available for such problems. One of these techniques is reduction to SAT, i.e. the Boolean satisfiability problem. The variables and constraints of a CSP are translated (encoded) into a SAT instance, which is then solved by state-of-the-art SAT solvers, and the solution, if one exists, is translated back into a solution of the original CSP. The main aim of this thesis is to improve CSP solving techniques that use reduction to SAT. Two new hybrid encodings of CSPs to SAT are presented; they combine the strengths of the existing encodings. We give a proof of correctness for one of the encodings; no such proof existed in the literature. We developed the system meSAT, which enables reduction of CSPs to SAT using 4 basic and 2 hybrid encodings. The system also enables solving CSPs by reduction to two problems related to SAT: SMT (satisfiability modulo theories) and PB (pseudo-Boolean). We developed a portfolio for automated selection of the encoding/solver to be used on a new instance that needs to be solved. The developed portfolio is comparable with state-of-the-art portfolios. We developed a hybrid approach based on short solving timeouts, with the aim of significantly reducing the preparation time of a portfolio. Using this approach, we obtained results comparable to those obtained with a preparation time of the usual length. We compared several machine learning techniques to find out which one is best suited to the short training approach. The problem of assigning air traffic controllers to shifts is described, and three models of this problem are presented. We used a large number of different solving methods and a diverse set of solvers to solve this problem. We developed optimization techniques that aim to find optimal solutions of the problem. A hybrid technique combining reduction to SAT and local search is shown to be the most efficient one.
We also considered sudoku puzzles and the existing techniques for solving puzzles larger than 9 × 9. Among the techniques used, the existing reduction to SAT is the most efficient in solving these puzzles. We improved the existing algorithm for generating large sudoku puzzles. It is shown that simple preprocessing rules further improve the speed of generating large sudokus. URI: http://hdl.handle.net/123456789/4427 Files in this item: 1
MirkoStojadinovicTeza.pdf ( 2.030Mb ) 
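The reduction of a CSP to SAT described in the abstract above can be illustrated with a minimal sketch. It assumes the well-known direct encoding (one Boolean variable per variable/value pair, with at-least-one and at-most-one clauses); the abstract does not name the encodings meSAT implements, so this is an illustration of the general idea and not the thesis's own code. All function names are hypothetical, and a brute-force enumerator stands in for a real SAT solver.

```python
from itertools import combinations, product

def direct_encode(domains):
    """Direct encoding (assumed for illustration): one Boolean variable per
    (CSP variable, value) pair, numbered from 1 as in DIMACS."""
    var_map = {}
    clauses = []
    for name, values in domains.items():
        lits = []
        for value in values:
            var_map[(name, value)] = len(var_map) + 1
            lits.append(var_map[(name, value)])
        clauses.append(lits)                  # at least one value is chosen
        for a, b in combinations(lits, 2):
            clauses.append([-a, -b])          # at most one value is chosen
    return var_map, clauses

def forbid(var_map, assignment):
    """Conflict clause that rules out one combination of (variable, value) pairs."""
    return [-var_map[pair] for pair in assignment]

def brute_force_models(num_vars, clauses):
    """Stand-in for a SAT solver: enumerate all assignments (tiny instances only)."""
    models = []
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            models.append(bits)
    return models

def decode(var_map, model):
    """Translate a SAT model back into a CSP assignment."""
    return {name: value for (name, value), idx in var_map.items() if model[idx - 1]}

# Toy CSP: x and y range over {1, 2, 3} and must differ.
domains = {"x": [1, 2, 3], "y": [1, 2, 3]}
var_map, clauses = direct_encode(domains)
for v in domains["x"]:
    clauses.append(forbid(var_map, [("x", v), ("y", v)]))

solutions = [decode(var_map, m) for m in brute_force_models(len(var_map), clauses)]
```

In practice the clause list would be written out in DIMACS format and handed to an external SAT solver; the decoding step is the same.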
Mišković, Stefan (Beograd, 2016)
Abstract: In this dissertation, three NP-hard min-max discrete optimization problems are considered. The first is the multi-period emergency service location problem, the second is the dynamic maximal covering location problem with multiple covering radii, and the third is the uncapacitated multiple allocation p-hub center problem. In many practical situations, input parameters (such as user demands, transportation time or cost) vary with unknown distributions. It is therefore necessary to incorporate these uncertainties into the deterministic variants of the problems by applying a robust optimization approach. Mathematical models for the deterministic and non-deterministic variants of all three problems are developed, except for the deterministic uncapacitated multiple allocation p-hub center problem, which has already been addressed in the literature. In addition, it is proven, for the first time in the literature, that the emergency service location problem is NP-hard. The considered problems and their robust variants have numerous applications, because in real-life situations input parameters are often subject to uncertainty. The multi-period emergency service location problem may be used to determine optimal locations for police stations, fire brigades, ambulances, and other emergency units in a given region. The dynamic maximal covering location problem with multiple covering radii is useful when choosing the optimal strategy for establishing resources (service centers, suppliers, facilities, etc.) with maximal satisfaction of customer demands in a certain region, assuming that the service efficiency depends directly on the distance between the customer and the service center (i.e., the selected coverage radius). The uncapacitated multiple allocation p-hub center problem has significant applications in designing telecommunication and transportation networks, postal delivery systems, emergency systems, supply networks, etc.
Since exact methods provide optimal solutions only for problem instances of small dimensions, hybrid metaheuristic algorithms are developed to solve both the deterministic and the robust variants of the considered problems. The proposed hybrid algorithms are obtained by combining particle swarm optimization with a local search heuristic: either classical local search or the variable neighborhood search method. For the dynamic maximal covering location problem with multiple covering radii, a hybridization of a metaheuristic algorithm with an exact method based on linear programming is developed. All elements of the proposed algorithms are adapted to the problems under consideration. Different strategies are implemented to improve the efficiency of the proposed algorithms, especially for the calculation of the objective function value and for the local search part. The influence of different parameters of the hybrid algorithms on solution quality is analyzed in detail. All parameters are adjusted using analysis of variance. For all considered problems (both deterministic and robust variants), the performance of the proposed hybrid algorithms is evaluated on adequate test data sets. The proposed algorithms are compared with existing heuristics from the literature and with the exact methods incorporated in the commercial CPLEX solver. The experimental results indicate that the proposed algorithms are efficient in obtaining high-quality solutions for all considered test instances. The comparative analysis indicates the advantages of the proposed hybrid algorithms over existing methods in terms of solution quality and/or required computational time, especially for large problem dimensions. The results presented in this paper represent a contribution to the fields of discrete optimization, robust optimization and metaheuristic methods. URI: http://hdl.handle.net/123456789/4423 Files in this item: 1
Miskovic_Stefan_teza.pdf ( 1.773Mb ) 
Mladenović, Miljana (Beograd, 2016)
Abstract: The beginning of the new millennium was marked by the huge development of social networks, cloud-based internet technologies and applications of artificial intelligence tools on the web. The extremely rapid growth in the number of articles on the Internet (blogs, e-commerce websites, forums, discussion groups, systems for the transmission of short messages, social networks and portals for publishing news) has increased the need for methods of rapid, comprehensive and accurate text analysis. The remarkable development of language technologies has therefore enabled their application in document classification, document clustering, information retrieval, word sense disambiguation, text extraction, machine translation, computer speech recognition, natural language generation, sentiment analysis, etc. In computational linguistics, several different names are in use for the area concerned with the processing of emotions in text: sentiment classification, opinion mining, sentiment analysis, sentiment extraction. According to its nature and the methods used, sentiment analysis of text belongs to the field of computational linguistics that deals with text classification. In the analysis of emotions we generally speak of three kinds of text classification:
• identification of subjectivity (opinion classification or subjectivity identification), used to divide texts into those that carry emotional content and those that have only factual content
• sentiment classification (polarity identification) of texts that carry emotional content into those with positive and those with negative emotional content
• determining the strength or intensity of emotional polarity (strength of orientation).
In terms of the level at which the analysis of feelings is carried out, there are three methodologies: analysis at the document level, at the sentence level and at the level of attributes.
Standard methods of text classification usually use machine learning methods or rule-based techniques. Sentiment analysis, as a specific type of document classification, also uses these methods. This doctoral thesis, whose main task is the analysis of emotions in text, presents research on the sentiment classification of texts in the Serbian language, using a probabilistic machine learning method, multinomial logistic regression, i.e. the maximum entropy method. The aim of this research is to create the first comprehensive, flexible, modular system for sentiment analysis of Serbian language texts, with the help of digital resources such as semantic networks, specialized lexicons and domain ontologies. The research is divided into two phases. The first phase concerns the development of methods and tools for detecting the sentiment polarity of the literal meaning of the text. In this part of the work, a new method of reducing the feature vector space for sentiment classification is proposed, implemented and evaluated. The proposed reduction method is applied in the maximum entropy classification model, and relies on the lexical-semantic network WordNet and a specialized sentiment lexicon. The proposed method consists of two successive processes. The first process expands the feature vector space with the inflectional forms of features. The study has shown that stemming, a standard method of reducing the feature vector space in text classification, can lead to incomplete or incorrect sentiment-polarity labelling of features when used in sentiment analysis, and that this problem can be avoided by introducing inflectional feature forms. The paper shows that a feature vector space enlarged by the introduction of inflectional forms can be successfully reduced using the second proposed procedure: semantic mapping of all predictors with the same sentiment polarity into a small number of semantic classes.
In this way, the feature vector space is reduced compared to the initial one, and it also retains the semantic precision. The second phase of the dissertation describes the design and implementation of formal ontologies of Serbian language rhetorical figures – the domain ontology and the task ontology. Usage of the task ontology in generating features representing figurative speech is presented. The research aim of the second phase is to recognize figurative speech to be used in improving of the existing set of predictors generated in the first phase of the research. The research results in this phase show that some classes of figures of speech can be recognized automatically. In the course of working on this dissertation, a software tool SAFOS (Sentiment Analysis Framework for Serbian), as an integrated system for sentiment classification of text in Serbian language, has been developed, implemented and statistically evaluated. Results of the research within the scope of this thesis are shown in papers (Mladenović & Mitrović, 2013; Mladenović & Mitrović, 2014; Mladenović, Mitrović & Krstev, 2014; Mladenović, Mitrović, Krstev & Vitas, 2015; Mladenović, Mitrović & Krstev, 2016). The dissertation consists of seven chapters with the following structure. Chapter 1 introduces and defines methods, resources and concepts used in the first phase of research: text classification, sentiment classification, machine learning, supervised machine learning, probabilistic supervised machine learning, and language models. At the end of the introductory section, the tasks and objectives of the research have been defined. Chapter 2 presents a mathematical model of text classification methods and classification of sentiment methods. A mathematical model of a probabilistic classification and an application of the probabilistic classification in regression models are presented. 
At the end of the chapter it is shown that the method using the mathematical model of maximum entropy, as one of the regression models, has been successfully applied to natural language processing tasks. Chapter 3 presents the lexical resources of the Serbian language and the methods and tools for processing them. Chapter 4 gives a comprehensive survey of the currently available types and methods of sentiment classification. It shows the current work and research in sentiment classification of texts. It also presents a comparative overview of research in sentiment classification of texts using the maximum entropy method. Chapter 5 discusses the contribution of this thesis to methods of feature space reduction for maximum entropy classification. First, a feature space reduction method is analysed. A new feature space reduction method which improves sentiment classification is then proposed. A mathematical model containing the proposed method is defined. The learning and testing sets and the lexical-semantic resources used in the proposed method are introduced. Chapter 5 also describes the building and evaluation of a system for sentiment classification, SAFOS, which applies and evaluates the proposed method of feature vector space reduction. The parameters and functions of SAFOS are defined. The measures used to evaluate the system are also discussed: precision, recall, F1-measure and accuracy. A description of the method for assessing the statistical significance of a system is given, and the implementation of the statistical test in SAFOS is discussed. The chapter provides an overview of the presented experiments, results and evaluation of the system. Chapter 6 deals with methods of recognizing figurative speech which can improve sentiment classification. The notion of a domain ontology, the role of rhetorical figures, and the domain ontology of rhetorical figures are introduced.
The importance of figurative speech in sentiment classification is explored. A description is given of the construction and structure of the first domain ontology of rhetorical figures in the Serbian language, RetFig.owl. A description is also given of the construction and structure of the corresponding task ontology, which contains rules for identifying some classes of rhetorical figures. At the end of this chapter, an overview is given of the performed experiments, results and evaluation of the SAFOS system plugin that improved the recognition of figurative speech. The final chapter of this study deals with the achievements, problems and disadvantages of the SAFOS system. The conclusion of the thesis points to the great technological, social, educational and scientific importance of sentiment analysis and the recognition of figurative speech, and outlines directions for further development of the SAFOS system. URI: http://hdl.handle.net/123456789/4422 Files in this item: 1
Mladenovic_Miljana.pdf ( 13.60Mb ) 
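The evaluation measures named in the abstract above (precision, recall, F1-measure and accuracy) can be sketched for a binary polarity classifier as follows. This is a minimal illustration using the standard textbook definitions; the function name, the label strings and the toy data are hypothetical and are not taken from SAFOS.

```python
def binary_metrics(gold, predicted, positive="pos"):
    """Standard precision, recall, F1 and accuracy for one positive class.
    Labels and the function name are illustrative assumptions."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    tn = sum(1 for g, p in pairs if g != positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Toy example: five documents with gold and predicted polarity labels.
gold = ["pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos"]
scores = binary_metrics(gold, predicted)
```

Here two positives are found, one is missed and one negative is wrongly flagged, so precision, recall and F1 are all 2/3 and accuracy is 3/5.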
Stojanović, Sana (Beograd, 2016)
Abstract: The advance of geometry over the centuries can be observed through the development of the different axiomatic systems that describe it. The use of axiomatic systems begins with Euclid and continues with Hilbert and Tarski, but it does not end there. Even today, new axiomatic systems for Euclidean geometry are being developed. Avigad's axiomatic system goes back to the beginnings and precisely describes the basic derivations presented in Euclid's "Elements". Writing an axiomatic system in a format suitable for computer theorem proving is a challenge in itself. Imprecise formulations of axioms that appear in books are exposed only when they need to be written in a format suitable for computers. The formalization of different axiomatic systems, and computer-assisted proofs within the theories they describe, is the main motif of this thesis. Programs for theorem proving have existed since the eighties, and today they form a collection of very powerful tools. This thesis presents a system for automated and formal theorem proving which uses the power of resolution theorem provers and a coherent prover, as well as interactive theorem provers for verifying the generated proofs. The coherent prover ArgoCLP is one of the contributions of the thesis. Additionally, the thesis develops a dialect of coherent logic based on natural deduction, which enables simple transformation of generated proofs into proofs written in the languages of the interactive provers Isabelle and Coq, as well as in natural languages, English and Serbian. The system for theorem proving is applied to three axiomatic systems of Euclidean geometry, illustrating its applicability both to proving large mathematical theories and to the verification of informal proofs from mathematical textbooks. URI: http://hdl.handle.net/123456789/4416 Files in this item: 1
SanaStojanovic.pdf ( 1.885Mb ) 
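For readers unfamiliar with coherent logic, the general shape of a coherent formula, as it is usually defined, can be written as below. This is the standard textbook form and not necessarily the exact dialect developed in the thesis; the example axiom and its predicate names (point, line, inc) are hypothetical illustrations.

```latex
% General form of a coherent formula: the A_i are atomic formulas and
% each B_j is a conjunction of atomic formulas.
\forall \vec{x}\,\bigl( A_1(\vec{x}) \land \dots \land A_n(\vec{x})
  \;\rightarrow\; \exists \vec{y}\,( B_1(\vec{x},\vec{y}) \lor \dots \lor B_m(\vec{x},\vec{y}) ) \bigr)

% Illustrative geometric axiom in this form (predicate names are assumptions):
% for any two points there is a line incident with both.
\forall A\,\forall B\,\bigl( \mathit{point}(A) \land \mathit{point}(B)
  \;\rightarrow\; \exists l\,( \mathit{line}(l) \land \mathit{inc}(A,l) \land \mathit{inc}(B,l) ) \bigr)
```

Formulas of this shape support forward reasoning with explicit witnesses, which is one reason proofs found this way tend to translate readily into readable, step-by-step form.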
Vujičić Stanković, Staša (Beograd, 2016)
Abstract: The basic goal of this doctoral thesis is to investigate different techniques and models applied in information extraction, and to provide computational support for processing natural language texts from the culinary and gastronomy domain. Information extraction is a subfield of computational linguistics which includes natural language processing techniques for finding relevant information, defining its meaning and establishing relations between the pieces of information. Special attention is given to ontology-based information extraction. It consists of the following: recognition of instances of ontology concepts in non-structured or semi-structured texts written in natural language, reasoning over the identified instances based on the rules defined in the ontology, and recognition of instances and their use for instantiating the proper ontology concepts. The main result of the thesis is a new model for ontology-based information extraction. Besides solving information extraction tasks, the new model includes not only the upgrading of existing lexical resources and ontologies, but also the creation of new ones. Its application resulted in the development of a system for extracting information related to the culinary domain, but the model can be used in other fields as well. In addition, a food ontology has been developed, the Serbian WordNet has been extended with another 1,404 synsets from the culinary domain, and the electronic dictionary of Serbian has been enlarged with 1,248 entries. The significance of the model's application comes from the fact that the new and enriched linguistic resources can be used in other natural language processing systems. The opening chapter of the thesis elaborates the need for a computational model for processing a huge linguistic corpus related to the culinary and gastronomy domain, through a methodologically precise and solid approach that integrates pieces of information on the domain.
The formalization of the basic research subject, text in electronic form, is also presented. The chapter then describes the approximations of natural languages introduced to enable modern information technologies to process texts written in natural languages, and it emphasizes the need to characterise the language of a text by a corresponding corpus and sublanguage. Further on in the first chapter, the task of information extraction is defined, together with the models for computational processing of non-structured or semi-structured texts which the computer uses to interpret the meaning that the author (not necessarily a human) intended to give while writing the text. Additionally, this chapter describes the methods used in the field of information extraction: rule-based methods and methods based on machine learning. Their advantages and shortcomings are listed, as well as the reasons why techniques based on linguistic knowledge are used in this thesis. At the end of the introductory chapter, special attention is given to ontologies, to WordNet, and to the significance of its use as an ontology. The second chapter presents the linguistic resources and tools exploited in this thesis. It describes the morphological dictionaries and local grammars used for solving the problem of information extraction from texts written in Serbian. A review of information extraction systems follows. At the end of the second chapter, the stages of processing texts written in Serbian during information extraction in the software systems Unitex and GATE are described. The main result of the thesis is presented in the third chapter.
It is a model for solving the problem of information extraction by integrating linguistic resources and tools. The model includes the creation of a text corpus, the definition of information extraction tasks, the establishment of finite-state models for information extraction and their application, the iterative enlargement of electronic morphological dictionaries, the enrichment and enhancement of WordNet, and the creation of new ontologies. Each of these steps is described thoroughly. Even though the model was at first conceived as a solution for problems in processing Serbian, it can equally be applied to texts written in other languages, provided that suitable language resources are developed. The implementation of these steps is described in the fourth chapter, through a system for information extraction from culinary texts written in Serbian. The chapter then describes how the lexical resources were developed together and complement one another, through the steps of creating a domain corpus, identifying the culinary lexicon, expanding and upgrading WordNet and the electronic morphological dictionaries, and developing domain ontologies: the food ontology, the approximate measure ontology, and the ontology of ingredients that can replace one another in the culinary domain. The information extraction system has served as the basis for an advanced search system which, based on a corpus of culinary texts, generates all possible answers to user queries. Within this system, a specific method is implemented for creating links between different recipes. It is used when a user reading the text of a recipe notices that the preparation description contains a part that has already appeared in another recipe, but with an additional or different explanation.
Another contribution of this thesis is the application of the developed ontologies to the tasks of converting approximate measures into standard measures and establishing similarities among recipes. The similarity of recipes is defined as the similarity of the texts which describe the process of preparing a dish according to a specific recipe. The last chapter contains the final conclusions and directions for future research. URI: http://hdl.handle.net/123456789/4410 Files in this item: 1
teza_Stasa.pdf ( 10.38Mb )