# Browsing Computer Science by Title

Now showing items 1-20 of 28

Vujošević Janičić, Milena (Beograd, 2013)
Abstract: LAV is a system for statically verifying program assertions and locating bugs such as buffer overflows, pointer errors, and division by zero. LAV is primarily aimed at analyzing programs written in the programming language C. Since LAV uses the popular LLVM intermediate code representation, it can also analyze programs written in other procedural languages, and the proposed approach can be used with any similar low-level intermediate code representation. The system combines symbolic execution, SAT encoding of the program's control flow, and elements of bounded model checking. LAV represents the program's meaning using first-order logic (FOL) formulas and generates the final verification conditions as FOL formulas. Each block of code (a block has no internal branching and no loops) is represented by a FOL formula obtained through symbolic execution. Symbolic execution, however, is not performed across blocks; instead, relationships between blocks are modeled by propositional variables encoding transitions between them. LAV constructs the formula encoding a block's semantics once per block, then combines these formulas with propositional formulas encoding the transitions between blocks. The resulting compound FOL formulas describe correctness and incorrectness conditions of individual instructions. These formulas are checked by an SMT solver supporting a suitable combination of theories; the theories that can be used for modeling correctness conditions are linear arithmetic, bit-vectors, uninterpreted functions, and arrays. Based on the results obtained from the solver, the analyzed command may be given the status safe (the command never leads to an error), flawed (the command always leads to an error), unsafe (the command may lead to an error), or unreachable (the command is never executed).
If a command cannot be proved safe, LAV translates a potential counterexample from the solver into a program trace that exhibits the error, and extracts the values of relevant program variables along this trace. The proposed system is implemented in the programming language C++ as a publicly available, open-source tool named LAV. LAV has support for several SMT solvers (Boolector, MathSAT, Yices, and Z3). Experimental evaluation on a corpus of C programs designed to demonstrate areas of strength and weakness of different verification techniques suggests that LAV is competitive with related tools. Experimental results also show a significant advantage of the proposed system over symbolic execution applied to programs containing a large number of possible execution paths: the proposed approach can determine the status of commands in programs that are beyond the scope of symbolic execution tools. LAV was successfully applied in an educational context, where it was used for finding bugs in programs written by students in an introductory programming course. This application showed that such programs contain a large number of bugs that a verification tool can efficiently find. Experimental results on a corpus of students' programs showed that LAV can find bugs that commonly used automated testing techniques cannot. It is also shown that LAV can improve the evaluation of students' assignments: (i) by providing useful and helpful feedback to students, which is important in the learning process, and (ii) by improving the automated grading process, which is especially important to teachers.
URI: http://hdl.handle.net/123456789/4231
## Files in this item: 1

phdVujosevicJanicicMilena.pdf (1.748Mb)
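The safe / flawed / unsafe / unreachable classification described in the abstract above can be illustrated with a small sketch. This is not LAV's implementation: where LAV checks FOL verification conditions with an SMT solver, the toy below simply enumerates a small set of reachable states, and the example command and all names are invented for illustration.

```python
# Toy sketch (not LAV itself): classifying a command as safe / flawed /
# unsafe / unreachable by checking its error condition against the set
# of program states reachable at that point. LAV does this symbolically
# with an SMT solver; here the state space is small enough to enumerate.
def classify(reachable_states, error_condition):
    if not reachable_states:
        return 'unreachable'
    erring = [s for s in reachable_states if error_condition(s)]
    if not erring:
        return 'safe'       # the command never leads to an error
    if len(erring) == len(reachable_states):
        return 'flawed'     # the command always leads to an error
    return 'unsafe'         # some states lead to an error

# invented command: y = 100 / (x - 10); error condition: divisor is zero
div_by_zero = lambda x: x - 10 == 0
print(classify(range(0, 101), div_by_zero))   # -> unsafe (x = 10 errs)
print(classify(range(11, 101), div_by_zero))  # -> safe
print(classify([10], div_by_zero))            # -> flawed
print(classify([], div_by_zero))              # -> unreachable
```

A real analysis would derive the reachable-state description from the block formulas and path conditions rather than enumerate it.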
Marinković, Vesna (Beograd, 2015)
Abstract: Geometry construction problems using ruler and compass are among the oldest and most challenging problems in elementary mathematics. A solution of a construction problem is not an illustration but a procedure that, using the given construction primitives, provides a "recipe" for constructing the sought object. The main difficulty in solving construction problems, both for a human and for a computer, is the combinatorial explosion that occurs during the solving process, as well as the need to prove the solutions correct. In this dissertation a method for automated solving of one class of construction problems, so-called location problems, is proposed: problems in which the task is to construct a triangle given the locations of three characteristic points. This method successfully solved most of the solvable problems from Wernick's and Connelly's lists. Each problem is first checked for symmetry with an already solved problem, and then its status is determined: the problem can be found redundant, locus dependent, solvable, or not solvable using existing knowledge. A description of the construction is generated automatically, both in natural language and in the GCLC language, accompanied by a corresponding illustration, a correctness proof of the generated construction, and the list of conditions under which the construction is correct. The proposed method is implemented within the tool ArgoTriCS. For proving the generated constructions correct, ArgoTriCS uses the newly developed prover ArgoCLP, the automated theorem provers integrated within the tools GCLC and OpenGeoProver, as well as the interactive theorem prover Isabelle. It is demonstrated that the obtained proofs can be machine verifiable. This dissertation also describes the system ArgoCLP (developed in collaboration with Sana Stojanović), which automatically proves theorems in coherent logic.
This prover has been successfully applied to different axiomatic systems. It automatically generates proofs in natural-language form, as well as machine-verifiable proofs whose correctness can be checked using the interactive theorem prover Isabelle. An important part of this system is a module for simplification of the generated proofs, which yields shorter and more readable proofs.
URI: http://hdl.handle.net/123456789/4406
## Files in this item: 1

tezaVesnaMarinkovic.pdf (2.233Mb)
Alshafah, Samira (Beograd, 2018)
Abstract: Proteins with intrinsically disordered regions are involved in a large number of key cell processes, including signaling, transcription, and chromatin remodeling. On the other hand, such proteins have been observed in people suffering from neurological and cardiovascular diseases, as well as various malignancies. Experimentally determining disordered regions in proteins is a very expensive and long-term process. As a consequence, various computer programs for predicting the position of disordered regions in proteins have been developed and are constantly being improved. In this thesis a new method for determining amino-acid sequences that characterize ordered/disordered regions is presented. The material used in the research includes 4076 viruses with more than 190000 proteins. The proposed method is based on defining a correspondence between characteristics of n-grams (including both repeats and palindromic sequences) and their belonging to ordered/disordered protein regions. Positions of ordered/disordered regions are predicted using three different predictors. The features of the repetitive strings used in the research include mole fractions, fractional differences, and z-values. In addition, the data mining techniques of association rules and classification were applied to both repeats and palindromes. The results obtained by all techniques show a high level of agreement for short lengths (less than 6), and the level of agreement grows to its maximum as the length of the sequences increases. The high reliability of the results obtained by the data mining techniques shows that there are n-grams, both repeating sequences and palindromes, that uniquely characterize the disordered/ordered regions of proteins. The obtained results were verified by comparison with results based on n-grams from the DisProt database, which contains the positions of experimentally verified disordered regions of proteins.
The results can be used both for fast localization of disordered/ordered regions in proteins and for further improving existing programs for their prediction.
URI: http://hdl.handle.net/123456789/4746
## Files in this item: 1

ThesisSamira_Alshafah.pdf (3.106Mb)
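The n-gram features described in the preceding abstract (repeats, palindromes, mole fractions) can be sketched in a few lines; the toy amino-acid sequence and the choice n = 3 are invented for illustration.

```python
# Hedged sketch of the abstract's core notion: extract n-grams from a
# protein sequence, flag those that are palindromes, and compute mole
# fractions (an n-gram's count over the total number of n-grams).
from collections import Counter

def ngrams(seq, n):
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def is_palindrome(s):
    return s == s[::-1]

protein = 'MKDLLAALLDKDLL'          # toy amino-acid sequence
counts = Counter(ngrams(protein, 3))
palindromes = [g for g in counts if is_palindrome(g)]

total = sum(counts.values())
mole_fractions = {g: c / total for g, c in counts.items()}

print(counts['DLL'])         # -> 2 ('DLL' is a repeat)
print('DKD' in palindromes)  # -> True ('DKD' reads the same reversed)
```

An actual predictor would compare such statistics between regions annotated as ordered and disordered rather than over a single sequence.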
Elfaghine, Halima (Beograd, 2016)
Abstract: The subject of this thesis belongs to the area of quality control, the practical use of statistics in monitoring and improving a production process. In 1930 Walter Shewhart started studying quality control, based on control charts and statistical principles. Later, after World War II, W. Edwards Deming took this discipline to Japan, where it flourished. The design and the performance of control charts are the most important problems in this area; the characteristics of control charts are studied in this thesis in order to compare existing and newly suggested control charts. The thesis is divided into four chapters. The first chapter is introductory and contains motivation and basic definitions related to the subject. Throughout the study it is assumed that the data are normally distributed and that the in-control process data are stationary and uncorrelated. Shewhart control charts and the corresponding limits are constructed so as to meet the given specifications for the quality characteristic under investigation. Quality control applied to a production process always incurs costs; the important parameters connected to the cost of quality control are the width of the control limits k, the sample size n, and the interval between samples h. In Chapter 2 a new loss function is given, connected to the production process and to the X-bar quality control chart. Using a Matlab optimization program, the values k̂, n̂, and ĥ that minimize the loss function for given costs are found. For given values of the cost, a non-linear regression model is built using the package SigmaPlot, and the obtained values are compared to those obtained by numerical optimization. In Chapter 3, the time series model Yi = λXi + (1 − λ)Yi−1 is investigated, where 0 < λ ≤ 1 is a constant and the Xi are N(μ, σ²) distributed.
Exponentially Weighted Moving Average (EWMA) control charts for this model are presented, and type I and type II errors are calculated for the case when i is large. For different sample sizes, a new comparison between the optimal designs of the X-bar and EWMA control charts for a normally distributed quality characteristic is given, comparing the corresponding cost-loss functions, power functions, and average run lengths. The process of calibration is one of the methods of statistical process control, introduced to improve the quality of products and reduce production costs. In Chapter 4, two new models of non-symmetrical loss function are introduced. Here the loss function is connected to a single product under control (not to the whole sample). Using our program, written in the statistical software R, the value minimizing the expected loss for the Shewhart X-bar control chart is found. This value is used as the new central target value of the quality characteristic; that is, the production process is calibrated with this new value. The thesis ends with Conclusions, where the results are summarized and some open problems for future investigation are listed.
URI: http://hdl.handle.net/123456789/4340
## Files in this item: 1

Disertacija_Halima_Elfaghihe.pdf (1.104Mb)
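The EWMA recursion studied in Chapter 3 of the abstract above, Yi = λXi + (1 − λ)Yi−1, can be sketched as a minimal chart computation. The asymptotic control limits mu0 ± L·σ·sqrt(λ/(2 − λ)) are the standard large-i form; the data and parameter values below are invented.

```python
# A minimal sketch of an EWMA control chart:
#   y_i = lam * x_i + (1 - lam) * y_{i-1}
# with asymptotic limits mu0 +/- L * sigma * sqrt(lam / (2 - lam)).
import math

def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    half_width = L * sigma * math.sqrt(lam / (2 - lam))
    ucl, lcl = mu0 + half_width, mu0 - half_width
    y = mu0                      # the chart starts at the target value
    out_of_control = []
    for i, x in enumerate(xs):
        y = lam * x + (1 - lam) * y
        if not (lcl <= y <= ucl):
            out_of_control.append(i)
    return out_of_control

# a sustained shift after sample 5 is eventually flagged
data = [0.1, -0.2, 0.0, 0.1, -0.1, 1.2, 1.1, 1.3, 1.0, 1.2]
print(ewma_chart(data, mu0=0.0, sigma=0.5))  # -> [7, 8, 9]
```

Note the detection lag: the shift begins at sample 5 but the smoothed statistic crosses the limit only at sample 7, which is exactly the memory effect that distinguishes EWMA charts from Shewhart charts.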
Korać, Vanja (Beograd, 2014)
Abstract: Digital forensics is a multidisciplinary science combining different disciplines (computer science, law, criminology) and facing numerous challenges: the massive generation of digital data (Big Data), virtualization of the client and server sides (cloud computing), lack of alignment among standardization bodies, and a general shortage of standards and experts across all of these disciplines. Since digital forensics applies to all digital devices, the narrower scientific field includes numerous applications, such as computer forensics, mobile device forensics, forensics of modern automobile systems, sensor networks, and so on. This thesis analyzes and applies the narrower field of computer forensics. It describes digital forensics of computer systems based on the Windows and Linux platforms, focusing on specific points in an implemented proactive digital forensics system that can indicate forensically relevant events critical to system security. Numerous methodologies, technologies, and techniques of high-tech crime investigation are described, and the process of data collection and "live" digital forensic analysis is discussed in detail. A short review is given of the characteristics and typically required functionalities of forensic software tools for initial response and for recovery of data and magnetic disk partitions. The most important digital forensic toolkits and their basic functionalities are also described. The thesis further highlights the key elements requiring special attention during digital forensic analysis in a virtual environment, and explains the most important segments of the virtual environment itself and how they can serve as significant tools in the digital forensic analysis process. The last part of the thesis focuses on vulnerabilities of the Windows and Linux platforms, with demonstrated methods of malicious system compromise.
General vulnerabilities are described, as well as specific vulnerabilities relating only to the Windows or only to the Linux platform, together with the most common methods of malicious exploitation. Vulnerabilities of computer systems and networks can relate to programs, hardware, configuration, and people. Setting aside people, the most significant and at the same time most critical factor in information security, program vulnerabilities are typically exploited for direct online attacks or for attacks by malicious programs. Discovering and removing vulnerabilities of system programs is one of the main goals of digital forensics. Besides collecting forensically relevant digital data and building solid digital evidence of a computer incident or crime for the judicial system, the goal of digital forensic analysis is to permanently remove the exploited vulnerabilities so that incidents or unlawful activities of that kind never recur; in this sense the contribution of this thesis is very significant. A practical examination of service vulnerabilities on the Windows and Linux platforms covered 80 operating systems: 51 Windows and 29 Linux. The results stem from two years of research, as the systems were examined in 2011 and 2013. By scanning and presenting the vulnerabilities of default installations of Windows and Linux systems, vulnerabilities that could potentially be exploited by security threats (malicious programs or attackers) and thereby endanger computer systems and information are discovered preventively, and their proactive removal realizes preventive protection. Establishing a proactive forensics system provides real-time logging of forensically relevant events, i.e., traces of attack attempts, which substantially facilitates forensic investigation in case of an incident or unlawful activity.
URI: http://hdl.handle.net/123456789/3869
## Files in this item: 1

doktorat_Vanja_Korac.pdf (9.093Mb)
Vujičić Stanković, Staša (Beograd, 2016)
Abstract: The basic goal of this doctoral thesis is research into the different techniques and models applied in information extraction, and the provision of informatic support for processing natural-language texts from the culinary and gastronomy domain. Information extraction is a subfield of computational linguistics comprising techniques for natural language processing that find relevant information, define its meaning, and establish relations between pieces of it. Special attention is given to ontology-based information extraction, which consists of the following: recognition of instances of ontology concepts in unstructured or semi-structured texts written in natural language, reasoning over the identified instances based on the rules defined in the ontology, and use of the recognized instances for instantiating the proper ontology concepts. The main result of the thesis is a new model for ontology-based information extraction. Besides solving information extraction tasks, the new model includes the upgrading of existing lexical resources and ontologies as well as the creation of new ones. Its application resulted in the development of a system for extraction of information related to the culinary domain, but the model can be used in other fields as well. In addition, a food ontology has been developed, the Serbian WordNet has been extended with another 1,404 synsets from the culinary domain, and the electronic dictionary of Serbian has been enlarged with 1,248 entries. The significance of the model lies in the fact that the new and enriched linguistic resources can be used in other systems for natural language processing. The opening chapter of the thesis elaborates the need to provide an informatic model for processing a huge linguistic corpus related to the culinary and gastronomy domain, through a methodologically precise and solid approach integrating pieces of information on the domain.
It also presents the formalization of the basic research subject, text in electronic form. The chapter then describes the approximations of natural languages introduced to enable modern information technologies to process texts written in natural languages, and emphasizes the need to characterize the language of a text by a corresponding corpus and sublanguage. Further on in the first chapter, the task of information extraction is defined, together with the models for informatic processing of unstructured or semi-structured texts that a computer uses to interpret the meaning the author (not necessarily a human) intended while writing the text. Additionally, this chapter describes the methods used in the information extraction field, those based on rules and those based on machine learning, lists their advantages and shortcomings, and explains why this thesis uses techniques based on linguistic knowledge. The introductory chapter concludes with special attention to ontologies, to WordNet, and to the significance of using WordNet as an ontology. The second chapter presents the linguistic resources and tools exploited in this thesis. It describes the morphological dictionaries and local grammars used for solving the problem of information extraction from texts written in Serbian, followed by a review of information extraction systems. At the end of the second chapter, the stages of processing Serbian texts during information extraction in the software systems Unitex and GATE are described. The main result of the thesis is presented in the third chapter.
It is a model for solving the problem of information extraction by integrating linguistic resources and tools, which includes the creation of a text corpus, the definition of information extraction tasks, the construction and application of finite-state models for information extraction, the iterative enlargement of electronic morphological dictionaries, the enrichment and enhancement of WordNet, and the creation of new ontologies. Each of these steps is described thoroughly. Even though the model was first conceived as a solution for processing Serbian, it can equally be applied to texts written in other languages, given the development of suitable language resources. The implementation of these steps is described in the fourth chapter, through a system for information extraction from culinary texts written in Serbian. This is followed by a description of the interdependent development and mutual complementing of lexical resources, through the steps of creating a domain corpus, identifying the culinary lexicon, expanding and upgrading WordNet and the electronic morphological dictionaries, and developing domain ontologies: the food ontology, the approximate-measure ontology, and the ontology of ingredients that can be used as mutual replacements in the culinary domain. The information extraction system has served for creating an advanced search system which, based on a corpus of culinary texts, generates all possible answers to users' inquiries. Within this system, a specific method is implemented for creating links between different recipes. It is used when the user, reviewing the text of a recipe, notices that its preparation description contains a part that already appeared in another recipe, but with an additional or different explanation.
Another contribution of this thesis is the application of the developed ontologies to tasks that convert approximate measures into standard measures and to establishing similarities among recipes, where the similarity of recipes is defined as the similarity of the texts describing the preparation of a dish according to a specific recipe. The last chapter contains final conclusions and directions for future research.
URI: http://hdl.handle.net/123456789/4410
## Files in this item: 1

teza_Stasa.pdf (10.38Mb)
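The ontology-based extraction pipeline described above can be caricatured in miniature: recognize instances of ontology concepts in text via a lexicon, then answer a query through the ontology's replacement relation. The lexicon, concept names, and recipe below are toy stand-ins for the thesis's Serbian resources.

```python
# Invented miniature of an ontology-based extraction pipeline; none of
# these names come from the thesis.
lexicon = {                       # surface form -> ontology concept
    'butter': 'DairyFat', 'margarine': 'DairyFat',
    'flour': 'Cereal', 'milk': 'Dairy',
}
replacements = {'DairyFat': ['butter', 'margarine']}  # mutual substitutes

def extract(text):
    """Map each recognized token to its ontology concept."""
    tokens = text.lower().replace(',', ' ').split()
    return {t: lexicon[t] for t in tokens if t in lexicon}

recipe = 'Mix flour, milk and butter'
found = extract(recipe)
print(found)                          # concepts instantiated from text
print(replacements[found['butter']])  # -> ['butter', 'margarine']
```

A real system would use morphological dictionaries and local grammars instead of exact token matching, since Serbian is highly inflected.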
Stojanović, Sana (Beograd, 2016)
Abstract: The advance of geometry over the centuries can be observed through the development of different axiomatic systems that describe it. The use of axiomatic systems begins with Euclid and continues with Hilbert and Tarski, but it does not end there: even today, new axiomatic systems for Euclidean geometry are being developed. Avigad's axiomatic system goes back to the beginnings and precisely describes the basic derivations presented in Euclid's "Elements". Writing an axiomatic system in a format suitable for computer theorem proving is a challenge in itself: imprecise formulations of axioms that appear in books get exposed only when they need to be written in a format suitable for computers. The formalization of different axiomatic systems, and computer-assisted proofs within the theories they describe, is the main motif of this thesis. Programs for theorem proving have existed since the eighties, and today they represent a collection of very powerful tools. This thesis presents a system for automated and formal theorem proving that uses the power of resolution theorem provers and a coherent prover, as well as interactive theorem provers for verifying the generated proofs. The coherent prover ArgoCLP is one of the contributions of the thesis. Additionally, the thesis develops a dialect of coherent logic based on natural deduction that enables a simple transformation of generated proofs into proofs written in the languages of the interactive provers Isabelle and Coq, as well as in the natural languages English and Serbian. The theorem proving system is applied to three axiomatic systems of Euclidean geometry, illustrating its applicability both to proving large mathematical theories and to verifying informal proofs from mathematical textbooks.
URI: http://hdl.handle.net/123456789/4416
## Files in this item: 1

SanaStojanovic.pdf (1.885Mb)
Simić, Danijela (Beograd, 2017)
Abstract: This thesis presents an interactive formalization of various models of geometry and algebraic methods for automated proving of geometry theorems. We present our work on formalizing analytic (Cartesian) plane geometry within the proof assistant Isabelle/HOL. We give several equivalent definitions of the Cartesian plane and show that it models synthetic plane geometries (using both Tarski's and Hilbert's axiom systems). We also discuss several techniques used to simplify and automate the proofs. As one of our aims is to advocate the use of proof assistants in mathematical education, our exposition tries to remain simple and close to standard textbook definitions, while staying completely formal and mechanically verifiable. This formalization develops the necessary infrastructure for implementing decision procedures based on analytic geometry within proof assistants. Furthermore, we investigate complex numbers. The deep connections between complex numbers and geometry have been well known and carefully studied for centuries. The fundamental objects investigated are the complex plane (usually extended by a single point at infinity), its objects (points, lines, and circles), and the groups of transformations that act on them (e.g., inversions and Möbius transformations). In this thesis we treat the geometry of complex numbers formally and present a fully mechanically verified development within the theorem prover Isabelle/HOL. We discuss different approaches to the formalization and the major advantages of the more algebraically oriented approach. Apart from applications in formalizing mathematics and in education, this work serves as a basis for formally investigating various non-Euclidean geometries and their intimate connections. We also present a formalization of part of Tarski's axiom system within the Poincaré disk model in Isabelle/HOL. Further on, we analyze connections between geometry and polynomials and the uses of these connections.
In Euclidean geometry, objects and relations between them can be expressed as polynomials. Moreover, any geometry construction can be expressed by a set of polynomials, and geometry statements can then be proved by algebraic methods (e.g., the Gröbner bases method or Wu's method) over that set of polynomials. We describe an implementation of an algorithm in Isabelle/HOL that accepts a term representation of a geometry construction and returns its corresponding set of polynomials. Our further work will be to use the Gröbner bases method within the Isabelle system on the generated polynomials, in order to prove the correctness of the given construction. Furthermore, we investigate how spatial geometry constructions can be presented using polynomials. We investigate two different approaches to deriving those polynomials and compare the efficiency of algebraic provers depending on the approach used. We present a fully automated system for transforming geometry constructions into sets of polynomials. Our further work will be to relate these geometry provers with dynamic geometry software and thus make them easier for students to use.
URI: http://hdl.handle.net/123456789/4499
## Files in this item: 1

06062017danijela_doktorat.pdf (1.158Mb)
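The idea of expressing a construction and a statement as polynomials, described in the abstract above, can be sketched without an algebraic prover. A real prover would check ideal membership via Gröbner bases or Wu's method; the toy below merely evaluates the statement polynomial on random instances of a hypothetical midpoint construction, all names being ours.

```python
# Hedged toy version: a construction is turned into polynomial relations
# on coordinates, and a statement holds if its polynomial vanishes
# whenever the construction's polynomials do. Here we only sample.
import random

def midpoint_construction(ax, ay, bx, by):
    # construction polynomials for "M is the midpoint of AB":
    # 2*mx - ax - bx = 0 and 2*my - ay - by = 0 define M
    mx, my = (ax + bx) / 2, (ay + by) / 2
    return mx, my

def statement_poly(ax, ay, bx, by, mx, my):
    # statement "MA and MB have equal length", as a polynomial:
    # (mx-ax)^2 + (my-ay)^2 - (mx-bx)^2 - (my-by)^2
    return (mx - ax)**2 + (my - ay)**2 - (mx - bx)**2 - (my - by)**2

random.seed(0)
for _ in range(100):
    ax, ay, bx, by = (random.uniform(-5, 5) for _ in range(4))
    mx, my = midpoint_construction(ax, ay, bx, by)
    assert abs(statement_poly(ax, ay, bx, by, mx, my)) < 1e-9
print('statement vanishes on all sampled instances')
```

Random sampling can only refute a statement, not prove it; that gap is precisely what Gröbner bases close, by showing the statement polynomial lies in the ideal generated by the construction polynomials.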
Čukić, Ivan (Beograd, 2018)
Abstract: There is a large class of problems that require software systems with asynchronously executed components. For example, distributed computations have distributed nodes that process data asynchronously to one another, service-oriented architectures need to process separate requests asynchronously, and multi-core and heterogeneous systems need multiple separate tasks running concurrently to best utilize the hardware. Even ordinary GUI applications need asynchronous components: the user interface needs to be responsive at all times, which means that no matter what state the program is in, it needs to process and react to the input events coming from the user. The necessity of concurrency and asynchronous execution brings the added complexity of inversion of control (IoC) into the system, either through message passing or through event processing. IoC makes code difficult to develop and reason about, increases component coupling, and inhibits clean functional or object-oriented software design. In this dissertation, a method for solving the problems that IoC introduces is presented. It shows how to model both synchronous and different types of asynchronous tasks with the continuation monad. The continuation monad serves as a primitive for building more complex control flow structures that mimic those of the host programming language. It also allows building control structures specialized for parallelism and transactional execution, and for simulating functional programming idioms with asynchronous tasks, through a generalization of the continuation monad that allows asynchronous tasks to generate results one at a time.
This allows writing programming systems with asynchronously executed components as seemingly synchronous imperative or functional code, while leaving it to the compiler to do the heavy lifting and convert the written code into an asynchronously executed set of tasks. Another benefit of the presented method is that it allows easier automatic handling of data lifetime without the need for garbage collection. The method has been successfully applied and tested in several Free/Libre Open Source Software and proprietary real-world software projects used by hundreds of millions of people around the world. This dissertation describes an example of a secure project management system based on a similar system implemented as part of the KDE Plasma project. It also covers the important parts of the implementation of the AsynQt library, which extends the Qt library and its concurrency primitive, the QFuture class, with functional reactive programming patterns based on the proposed method.
URI: http://hdl.handle.net/123456789/4738
## Files in this item: 1

ivan_cukic_phd.pdf (1.328Mb)
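The continuation-monad modelling described above can be miniaturized in a few lines. The names below are ours, not the AsynQt/Qt API, and the dissertation's setting is C++; a real implementation hands the continuation to an event loop rather than invoking it immediately.

```python
# A compact sketch of the core abstraction: modelling a task by "what to
# do with its result" (the continuation monad).
class Cont:
    def __init__(self, run):
        self.run = run            # run: (callback) -> None

    @staticmethod
    def unit(value):              # a task that immediately yields value
        return Cont(lambda k: k(value))

    def then(self, f):            # monadic bind: chain a dependent task
        return Cont(lambda k: self.run(lambda v: f(v).run(k)))

def fetch(url):                   # stand-in for an asynchronous request
    return Cont.unit(f'contents of {url}')

def count_words(text):
    return Cont.unit(len(text.split()))

# reads like synchronous code, but each step is a continuation
result = []
fetch('http://example.org').then(count_words).run(result.append)
print(result[0])   # -> 3
```

The point of the dissertation's method is that the chained `then` steps need not complete immediately: the same composition works when `fetch` stores the callback and fires it later, which is what removes the inversion of control from user code.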
Stojanović, Boban (Kragujevac, Serbia, 2007)
Abstract: Muscles are organs whose primary function is to produce force and motion. There are three types of muscles: skeletal (striated), smooth, and cardiac. Skeletal muscles are attached to bones and can move them voluntarily. Many daily activities extend over long periods of time, during which muscle performance can decline (reduction of maximal force, contraction speed, movement control, etc.). Although numerous mathematical models of muscles have been developed, only a few take muscle fatigue into account, and most of the existing fatigue models consider fatigue under specific conditions only. Motivated by the fact that the existing muscle fatigue models are very limited under arbitrary conditions of activation and loading, we present a new model that includes muscle fatigue. The proposed model is based on Hill's phenomenological model consisting of contractile, serial, and parallel elastic elements, but uses a fatigue curve under maximal activation and a recovery curve as input parameters in order to predict muscle response under arbitrary loading conditions. Furthermore, an extension of Hill's model is introduced to take into account different fiber types, since the various types of muscle fibers can have very different physiological and mechanical properties, significantly affecting their resistance to fatigue. The developed models are incorporated into the finite element software PAK and verified by comparing the calculated results with experimental measurements and data from the literature. Computer modeling of the human biceps and triceps muscles, as well as the frog gastrocnemius muscle, shows that the models can predict the behavior of real muscles with satisfactory precision. Besides application to single muscles, the proposed models can be used for computer simulations of complex musculoskeletal systems.
In order to provide efficient modeling of muscles and musculoskeletal systems, software for automatic muscle generation from medical images has been developed, as well as a module for result post-processing using various types of graphs. The proposed models and the developed software can be used as a very powerful tool in designing medical and sports equipment, planning training and analyzing exercises. Computer simulations based on the muscle mechanical models can prevent work injuries and significantly reduce costs for individuals and society. URI: http://hdl.handle.net/123456789/1843 ## Files in this item: 1

Boban Stojanovic - Doktorska disertacija.pdf ( 12.75Mb ) -
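The fatigue-extended Hill model described in this entry can be illustrated with a minimal sketch: a fatigue factor scales the active force of the contractile element, with fatigue and recovery curves as inputs. The exponential curves, time constants and function names below are invented placeholders (the thesis takes measured curves as input, and the actual implementation lives in the PAK finite element software):

```python
import math

def hill_force(activation, fatigue, f_max=1000.0):
    """Active force (N) of a Hill-type contractile element, scaled by a
    fatigue factor in [0, 1] (1 = fully rested). Illustrative only."""
    return f_max * activation * fatigue

def fatigue_factor(t, tau_fat=30.0):
    """Assumed exponential fatigue curve under sustained maximal
    activation; t is the time (s) under activation."""
    return math.exp(-t / tau_fat)

def recovery_factor(fatigue, t_rest, tau_rec=60.0):
    """Assumed exponential recovery toward the rested state (1.0)
    after t_rest seconds of rest, starting from a fatigue level."""
    return 1.0 - (1.0 - fatigue) * math.exp(-t_rest / tau_rec)
```

A force estimate after 30 s of maximal activation would then be `hill_force(1.0, fatigue_factor(30.0))`, with `recovery_factor` restoring the factor toward 1.0 during rest.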
Ivanović, Miloš (Kragujevac, Srbija , 2010)[more][less]
URI: http://hdl.handle.net/123456789/1838 ## Files in this item: 1

mivanovic-disertacija-pun-tekst.pdf ( 11.57Mb ) -
Mladenović, Miljana (Beograd , 2016)[more][less]
Abstract: The beginning of the new millennium was marked by the huge development of social networks, internet technologies in the cloud and applications of artificial intelligence tools on the web. The extremely rapid growth in the number of articles on the Internet (blogs, e-commerce websites, forums, discussion groups, systems for transmission of short messages, social networks and portals for publishing news) has increased the need for methods of rapid, comprehensive and accurate analysis of text. The remarkable development of language technologies has therefore enabled their application in document classification, document clustering, information retrieval, word sense disambiguation, text extraction, machine translation, computer speech recognition, natural language generation, sentiment analysis, etc. In computational linguistics, several different names are in use for the area concerning the processing of emotions in text: sentiment classification, opinion mining, sentiment analysis, sentiment extraction. According to the nature and the methods used, sentiment analysis of text belongs to the field of computational linguistics that deals with text classification. In the process of analysing emotions we generally speak of three kinds of text classification:
• identification of subjectivity (opinion classification or subjectivity identification), used to divide texts into those that carry emotional content and those that have only factual content;
• sentiment classification (polarity identification) of texts that carry emotional content into those with positive and those with negative emotional content;
• determining the strength or intensity of emotional polarity (strength of orientation).
In terms of the level at which the analysis of feelings is carried out, there are three methodologies: analysis at the document level, at the sentence level and at the level of attributes.
Standardized methods of text classification usually use machine learning methods or rule-based techniques. Sentiment analysis, as a specific type of document classification, also uses these methods. This doctoral thesis, whose main task is the analysis of emotions in text, presents research related to the sentiment classification of texts in the Serbian language, using a probabilistic machine learning method, multinomial logistic regression, i.e. the maximum entropy method. The aim of this research is to create the first comprehensive, flexible, modular system for sentiment analysis of Serbian texts, with the help of digital resources such as semantic networks, specialized lexicons and domain ontologies. The research is divided into two phases. The first phase concerns the development of methods and tools for detecting the sentiment polarity of the literal meaning of a text. In this part of the work, a new method of reducing the feature vector space for sentiment classification is proposed, implemented and evaluated. The proposed reduction method is applied in the maximum entropy classification model and relies on the lexical-semantic network WordNet and a specialized sentiment lexicon. It consists of two successive processes. The first process expands the feature vector space with the inflectional forms of features. The study has shown that using stemming in sentiment analysis, as a standard method of reducing the feature vector space in text classification, can lead to incomplete or incorrect sentiment-polarity feature labelling, and that this problem can be avoided by introducing inflectional feature forms. The thesis shows that the feature vector space, enlarged by the introduction of inflectional forms, can be successfully reduced using the second proposed procedure - semantic mapping of all predictors with the same sentiment polarity into a small number of semantic classes.
In this way, the feature vector space is reduced compared to the initial one, while retaining semantic precision. The second phase of the dissertation describes the design and implementation of formal ontologies of rhetorical figures in the Serbian language - a domain ontology and a task ontology. The use of the task ontology in generating features representing figurative speech is presented. The research aim of the second phase is to recognize figurative speech in order to improve the existing set of predictors generated in the first phase. The results of this phase show that some classes of figures of speech can be recognized automatically. In the course of working on this dissertation, a software tool SAFOS (Sentiment Analysis Framework for Serbian), an integrated system for sentiment classification of text in the Serbian language, has been developed, implemented and statistically evaluated. Results of the research within the scope of this thesis are presented in the papers (Mladenović & Mitrović, 2013; Mladenović & Mitrović, 2014; Mladenović, Mitrović & Krstev, 2014; Mladenović, Mitrović, Krstev & Vitas, 2015; Mladenović, Mitrović & Krstev, 2016). The dissertation consists of seven chapters with the following structure. Chapter 1 introduces and defines the methods, resources and concepts used in the first phase of the research: text classification, sentiment classification, machine learning, supervised machine learning, probabilistic supervised machine learning, and language models. At the end of the introductory section, the tasks and objectives of the research are defined. Chapter 2 presents a mathematical model of text classification methods and sentiment classification methods. A mathematical model of probabilistic classification and its application in regression models are presented.
At the end of the chapter it is shown that the maximum entropy method, as one of the regression models, has been successfully applied to natural language processing tasks. Chapter 3 presents the lexical resources of the Serbian language and the methods and tools for their processing. Chapter 4 gives a comprehensive survey of the currently available types and methods of sentiment classification. It reviews current work and research in the sentiment classification of texts, and presents a comparative overview of research in sentiment classification using the maximum entropy method. Chapter 5 discusses the contribution of this thesis to methods of feature space reduction for maximum entropy classification. First, an existing feature space reduction method is analysed. A new feature space reduction method which improves sentiment classification is proposed, and a mathematical model containing the proposed method is defined. The learning and testing sets and the lexical-semantic resources used in the proposed method are introduced. Chapter 5 also describes the construction and evaluation of the sentiment classification system SAFOS, which applies and evaluates the proposed method of feature vector space reduction. The parameters and functions of SAFOS are defined, and the measures used to evaluate the system are discussed - precision, recall, F1-measure and accuracy. A description of the method for assessing the statistical significance of the system is given, along with the implementation of the statistical test in SAFOS. The chapter provides an overview of the performed experiments, the results and the evaluation of the system. Chapter 6 deals with methods of recognizing figurative speech which can improve sentiment classification. The notion of a domain ontology is introduced, along with the role of rhetorical figures and a domain ontology of rhetorical figures.
The importance of figurative speech in sentiment classification has been explored. A description of the construction and structure of the first domain ontology of rhetorical figures in the Serbian language, RetFig.owl, is given, as well as of the corresponding task ontology that contains rules for the identification of some classes of rhetorical figures. At the end of this chapter, an overview of the performed experiments, results and evaluation of the SAFOS system plugin that improved the recognition of figurative speech is given. The final chapter of this study deals with the achievements, problems and shortcomings of the SAFOS system. The conclusion of this thesis points to the great technological, social, educational and scientific importance of sentiment analysis and the recognition of figurative speech, and suggests directions for further development of the SAFOS system. URI: http://hdl.handle.net/123456789/4422 ## Files in this item: 1

Mladenovic_Miljana.pdf ( 13.60Mb ) -
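The two-step reduction described in this entry (expand features with inflectional forms, then collapse same-polarity features into a small number of semantic classes) can be sketched roughly as follows. The tiny English lexicon, the class names and the function name are invented stand-ins for the Serbian WordNet and sentiment lexicon that SAFOS actually uses:

```python
INFLECTIONS = {            # inflectional forms per lemma (assumed data)
    "good": ["good", "better", "best"],
    "bad": ["bad", "worse", "worst"],
    "love": ["love", "loves", "loved", "loving"],
}
SEMANTIC_CLASS = {         # lemma -> (semantic class, polarity), assumed
    "good": ("POSITIVE_EVAL", "+"),
    "love": ("POSITIVE_EVAL", "+"),
    "bad": ("NEGATIVE_EVAL", "-"),
}

def reduce_features(tokens):
    """Map every inflectional form back to its lemma's semantic class,
    so that many surface features collapse into a few predictors while
    the sentiment polarity of each form stays correct (the problem
    stemming can get wrong)."""
    form_to_lemma = {form: lemma
                     for lemma, forms in INFLECTIONS.items()
                     for form in forms}
    reduced = []
    for tok in tokens:
        lemma = form_to_lemma.get(tok.lower())
        if lemma and lemma in SEMANTIC_CLASS:
            reduced.append(SEMANTIC_CLASS[lemma][0])
        else:
            reduced.append(tok.lower())     # leave unknown tokens as-is
    return reduced
```

In a full pipeline the reduced tokens would then be fed to the maximum entropy classifier as features, so that "loved" and "best" contribute to the same positive predictor.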
Marovac, Ulfeta (Beograd , 2015)[more][less]
Abstract: Proteins are significant biological macromolecules of polymeric nature (polypeptides), which consist of amino acids and are basic structural units of each cell. They are built from 20+3 amino acids and, as a consequence, they are represented in biological databases as sequences over 23 different characters. Proteins can be classified by their primary structure, secondary structure, function, etc. One possible classification of proteins by function is based on their membership in a certain cluster of orthologous groups (COG). This classification rests on a prior comparison of proteins by similarity of their primary structures, which is most often a result of homology, i.e. their common (evolutionary) origin. The COG database is obtained by comparison of the known and predicted proteins encoded in the completely sequenced prokaryotic (archaeal and bacterial) genomes and their classification by orthology. The proteins are classified into 25 categories, which can be arranged into three basic functional groups (proteins responsible for: (1) information storage and processing; (2) cellular processes and signaling; and (3) metabolism), or into a group of poorly characterized proteins. Classification of proteins by membership in a COG category (or KOG - euKaryote Orthologous Groups - for eukaryotic organisms) is significant for a better understanding of biological processes and various pathological conditions in humans and other organisms. The dissertation proposes a model for the classification of proteins into COG categories based on amino acid n-grams (sequences of length n). The data set contains protein sequences of genomes from 8 different taxonomic classes [TKL97] of bacteria (Aquificales, Bacteroidia, Chlamydiales, Chlorobia, Chloroflexia, Cytophagia, Deinococci, Prochlorales), which are known to have been classified by COG categories.
A new method is presented, based on generalized systems of Boolean equations, for separating the n-grams characteristic of the proteins of each COG category. The presented method significantly reduces the number of processed n-grams in comparison to previously used methods of n-gram analysis, so less memory and less time are needed for protein processing. Previously known methods for the classification of proteins into functional categories compared each new protein (whose function had to be determined) with the set of all proteins already classified by function, in order to determine the group containing the proteins most similar to the one being classified. The advantage of the new method over these is that it avoids sequence-to-sequence comparison and instead searches a protein for those patterns (n-grams, up to length 10) which are characteristic of the corresponding COG category. The selected patterns are assigned to the corresponding COG category and describe sequences of a certain length which have previously appeared only in that COG category, and not in the proteins of other COG categories. On the basis of the proposed method, a predictor for determining the COG category of a new protein is implemented, with the minimal precision of the prediction as one of the predictor's arguments. During the test phase the constructed predictor showed excellent results, with a maximal precision of 99% reached for some proteins. Owing to its properties and relatively simple construction, the proposed method can be applied in similar domains where the solution of a problem is based on n-gram sequence analysis. URI: http://hdl.handle.net/123456789/4308 ## Files in this item: 1

phdUlfetaMarovac.pdf ( 7.954Mb ) -
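The core selection step described in this entry - keeping only those n-grams that appear in exactly one category's sequences - can be made concrete without the Boolean-equation machinery of the thesis. The function names and toy category labels are illustrative, not the thesis's implementation:

```python
from itertools import chain

def ngrams(seq, n):
    """All contiguous n-grams of a sequence."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}

def characteristic_ngrams(classes, n):
    """For each class (e.g. a COG category, mapping to a list of
    protein sequences), keep only the n-grams that occur in that
    class and in no other class."""
    per_class = {c: set(chain.from_iterable(ngrams(s, n) for s in seqs))
                 for c, seqs in classes.items()}
    result = {}
    for c, grams in per_class.items():
        others = set().union(*(g for k, g in per_class.items() if k != c))
        result[c] = grams - others
    return result
```

A predictor in the spirit of the thesis would then scan a new protein for these characteristic patterns and assign the category whose patterns it matches.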
Zeković, Ana (Beograd , 2015)[more][less]
Abstract: The main focus of the thesis is the construction of new methods for computing various types of knot distances - the distance between knots realized by crossing changes (Gordian distance) and the distance realized by crossing smoothings (smoothing distance). Different ways of presenting knots are introduced, with emphasis on the mirror-curve model: the purpose of the model is presented, along with the coding of knots using it, a method for identifying the knots presented by the model, and the derivation of all knots that can be placed in nets of dimensions p×q (p ≤ 4, q ≤ 4). Various knot notations are described in detail, with a focus on Conway's notation and its topological properties. Existing algorithms are based on the algebra of continued fractions, closely related to the presentation of rational knots, which results in the absence of a large number of non-rational knots from the existing Gordian distance tables. The subject of the thesis is the implementation of methods for determining new distances equal to 1. The methods are based on non-minimal presentations of rational and non-rational knots, algorithms built on geometrical properties of Conway's notation, and weighted graph search. The results are organized into Gordian distance tables for knots with up to 9 crossings, which are enclosed with the thesis. In order to extend the tables to knots with a larger number of crossings, a method for extending the results to knot families is suggested. Using the relation between Gordian numbers and smoothing numbers, a new method for determining the smoothing number is presented, with results given as lists for knots with at most 11 crossings. In conjunction with the Conway notation concept and this method, algorithms for the smoothing distance are derived.
The new results are organized in knot tables up to 9 crossings, combined with previous results, and enclosed with the thesis. Crossing changes and smoothings in a knot diagram can be used to model topoisomerase and recombinase actions on DNA chains, and a method for studying the changes introduced by these enzymes is presented. A main contribution of the thesis is the consistent use of Conway's notation for all relevant results and methods, which led to a method for deriving new knots in Conway's notation by extending C-links. In the absence of an adequate pattern for the existing knot tables in DT-notation, a structure based on topological knot concepts is used. A method for knot classification based on Conway's notation is proposed, and tables of all knots with 13 crossings and alternating knots with 14 crossings have been generated and enclosed. The thesis also considers the Bernhard-Jablan conjecture on determining the unknotting number from minimal knot diagrams, which is crucial in the computation of various knot distances. It addresses one of the main problems in knot theory and contains a new method of knot minimization, based on the interplay of local and global minimization. New terms are defined, such as the maximum and the mixed unknotting number. Knots whose minimum crossing number does not change after a single crossing change are analyzed; three classes of such knots are recognized, named Kauffman knots, Zekovic knots and Taniyama knots. The most interesting conclusion about Zekovic knots is that all derived Perko knots (for n ≤ 13 crossings) are in fact Zekovic knots. Defining this class of knots makes it possible to state new properties of the well-known Perko knots. URI: http://hdl.handle.net/123456789/4255 ## Files in this item: 1

phdZekovicAna.pdf ( 5.246Mb ) -
Dražić, Zorica (Beograd , 2014)[more][less]
Abstract: The Variable neighborhood search method proved to be very successful for solving discrete and continuous optimization problems. The basic idea is a systematic change of neighborhood structures in search for a better solution. For the optimization of functions of several variables, methods are used that obtain a local minimum starting from a given initial point. When a continuous function has many local minima, finding the global minimum is usually not an easy task, since the obtained local minima in most cases are not optimal. In typical implementations with bounded neighborhoods of various diameters, it is not possible to reach all points of the solution space from an arbitrary point. Consequently, the strategy of using a finite number of neighborhoods is suitable for problems whose solutions belong to some known bounded subset of R^n. In order to overcome this limitation, a new variant of the method is proposed: the Gaussian Variable neighborhood search method. Instead of defining a sequence of different neighborhoods from which a random point will be chosen, all neighborhoods coincide with the whole solution space, but carry different probability distributions of Gaussian type. With this approach, any point, however distant, is theoretically reachable from an arbitrary point, although with smaller probability. In the basic version of the Variable neighborhood search method one must define in advance the neighborhood structure system, the number and size of the neighborhoods, as well as the type of random distribution used for obtaining a random point from them. The Gaussian Variable neighborhood search method has fewer parameters, since all neighborhoods are theoretically the same (equal to the solution space), and it uses only one distribution family - the multivariate Gaussian distribution with variable dispersion.
The File transfer scheduling problem (FTSP) is an optimization problem widely applicable to many areas such as Wide Area computer Networks (WAN), Local Area Networks (LAN), telecommunications, multiprocessor scheduling in MIMD machines, task assignments in companies, etc. As it belongs to the class of NP-hard problems, heuristic methods are usually used for solving problems of this kind. The problem is to minimize the overall time needed to transfer all files to their destinations for a given collection of files of various sizes in a computer network, i.e. to find the file transfer schedule of minimal length. In order to obtain exact solutions of the FTS problem, integer linear programming formulations are proposed and their correctness is proved. In this way optimal solutions can be found for small and medium size test instances. For large test instances, the Variable neighborhood search method is proposed, using the "permutation" representation and typical neighborhood structures. Moreover, the same method is used for obtaining upper bounds on the solutions, which are used in the proposed integer linear programming formulations. For obtaining better solutions in a small neighborhood of the current solution, three different local search procedures are implemented: 2-swap, 2-swap adjacent and variable neighborhood descent. In order to apply continuous optimization methods to the FTSP, a weighted solution representation is developed. Such a representation enables the use of continuous optimization methods that do not require differentiability of the objective function. Since the Gaussian Variable neighborhood search method proved successful on continuous optimization problems, it was applied to the FTSP. The previously described local search procedures can also be used with the weighted solution representation. Using the proposed methods, optimal solutions for all small and medium size test instances are found.
For large size instances, which are beyond the reach of exact methods, metaheuristic methods obtained good solutions in reasonable time. URI: http://hdl.handle.net/123456789/4246 ## Files in this item: 1

phdDrazic_Zorica.pdf ( 4.739Mb ) -
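The Gaussian VNS idea described in this entry can be sketched in a few lines: instead of nested bounded neighborhoods, each "neighborhood" k is the whole space sampled with a Gaussian of dispersion sigma_k around the incumbent, widening after failures and resetting after successes. This is a minimal illustration with made-up parameter values, not the thesis's implementation (which pairs the shaking step with a proper local search):

```python
import random

def gauss_vns(f, x0, sigmas=(0.1, 1.0, 10.0), iters=200, seed=0):
    """Minimal Gaussian-VNS sketch for continuous minimization.
    Every neighborhood is the whole space, sampled with a Gaussian of
    dispersion sigmas[k] around the incumbent; improving samples are
    accepted and reset k to the tightest dispersion."""
    rng = random.Random(seed)
    best, fbest = list(x0), f(x0)
    for _ in range(iters):
        k = 0
        while k < len(sigmas):
            cand = [xi + rng.gauss(0.0, sigmas[k]) for xi in best]
            fc = f(cand)
            if fc < fbest:
                best, fbest = cand, fc
                k = 0            # success: restart from tightest Gaussian
            else:
                k += 1           # failure: widen the Gaussian
    return best, fbest
```

Note how even the tightest Gaussian can, in principle, reach any point of the space, which is exactly the property the abstract contrasts with bounded neighborhoods.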
Vučković, Bojan (Beograd , 2017)[more][less]
Abstract: We present original results from the following fields of discrete mathematics: chromatic graph theory, extremal set theory and Boolean matrix theory. From chromatic graph theory we investigate edge and total colorings satisfying the condition that neighboring vertices of a graph are distinguished by the multiset, set or sum induced by the given coloring. A multiset neighbor-distinguishing edge coloring of a graph is an assignment of colors to edges such that, for every edge uv of the graph, the multiset of colors on the edges incident with the vertex u differs from the multiset on the edges incident with the vertex v. The previous best result concerning the minimum number of colors required for such a coloring of an arbitrary graph states that four colors are sufficient. The author's contribution is a proof that such a coloring is always possible with only three colors, which is in the general case the optimal number of colors. We construct a graph for which we subsequently prove that different numbers of colors are required to obtain a multiset neighbor-distinguishing coloring and a neighbor-distinguishing coloring by sum; as far as we know, this is the first example of such a graph. A few results concerning the neighbor expanded sum distinguishing coloring are given. The main contribution is a proof that for an arbitrary graph there exists a total coloring with colors from the set {1, 2, 3} such that every two adjacent vertices have different sums over their adjacent vertices and incident edges. Also, for certain classes of graphs it is proved that such a coloring exists using only the colors from the set {1, 2}. A neighbor-distinguishing edge coloring of a graph G requires that every two adjacent edges receive different colors, while the sets of colors on the edges incident with the vertices u and v differ for every edge uv of G.
The author presents an edge coloring procedure for an arbitrary graph without isolated edges where a smaller number of colors is used compared to all known results. For the adjacent vertex distinguishing total coloring of a graph G, the condition is that every two adjacent or incident elements of V(G) ∪ E(G) receive different colors, while for every edge uv of G the set composed of the colors assigned to the edges incident with u, together with the color of u, differs from the corresponding set for v. The author improves the upper bound on the minimum number of colors needed for such a coloring, relative to the maximum degree of the graph. Frankl's conjecture from extremal set theory states that for every family closed under union there exists an element contained in at least half of the sets of the family. We give a proof that Frankl's conjecture holds for every family over 12 elements, while previously this was known to be true only for families over 11 or fewer elements. Our proof is based on an efficient algorithm that exhausts all the possibilities, using results, obtained in a number of consecutive steps, about subfamilies that an eventual counterexample cannot contain. A family of sets G is an FC-family if for every family F containing G there exists an element of ∪G that appears in at least half of the sets of F. A NonFC-family is every family that is not FC. The author's contribution is the complete classification of all families over 6 or fewer elements into FC and NonFC-families. From Boolean matrix theory we present results concerning the row space cardinality. Boolean matrices are matrices whose components are all from the set {0, 1}, while the row space of a Boolean matrix is the set of vectors that can be obtained by disjunction of rows of the matrix.
We present the set of all values a from the interval [2n2 + 2n3; 2n2] such that there exists a matrix of dimension n × n whose row space cardinality equals a. For the least positive integer a_n for which there exists no n × n matrix with row space cardinality equal to a_n, the author gives a lower bound that improves on the previously known results. All proofs of the main results in the dissertation are constructive. Some of them require the use of computers to check a great number of possibilities. For the other proofs this was not necessary, though algorithms following the steps of the proofs can be implemented to obtain a graph coloring or a matrix with the desired properties. URI: http://hdl.handle.net/123456789/4661 ## Files in this item: 1

Disertacija_-_Bojan_Vuckovic.pdf ( 1.143Mb ) -
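The row space notion from this entry is easy to make concrete: it is the closure of the matrix's rows under componentwise OR (disjunction). The sketch below computes its cardinality by a simple closure pass; whether the all-zero vector (the empty OR) is counted varies between authors, so including it is an assumption here:

```python
def row_space_size(rows):
    """Cardinality of the row space of a Boolean matrix, given as a
    list of 0/1 rows: the number of distinct vectors obtainable as
    ORs of subsets of rows (the zero vector, from the empty subset,
    is included by the convention assumed here)."""
    space = {tuple([0] * len(rows[0]))}            # empty OR
    for row in rows:
        # after processing row i, `space` holds all ORs of subsets
        # of the first i rows
        space |= {tuple(a | b for a, b in zip(v, row)) for v in space}
    return len(space)
```

For the n × n identity matrix this yields 2^n, the maximum possible; the thesis studies which smaller cardinalities are and are not attainable.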
Filipović, Vladimir (Beograd , 2006)[more][less]
URI: http://hdl.handle.net/123456789/2478 ## Files in this item: 1

phdFilipovicVladimir.pdf ( 2.145Mb ) -
Radojčić, Nina (Beograd , 2018)[more][less]
Abstract: In this dissertation, three NP-hard optimization problems are studied and various computational intelligence methods are considered for solving them, with a special emphasis on the possibilities of applying fuzzy logic in order to improve the performance of the proposed methods. In addition, it is shown how fuzzy logic can be incorporated into a model to make it more adequate for real-world applications. The first problem considered is the Risk-Constrained Cash-in-Transit Vehicle Routing Problem (RCTVRP), a special case of the vehicle routing problem (VRP). As in the classical VRP, the aim is to determine collection routes from one depot to a number of customers so as to minimize the overall travel distance (or cost). Additionally, in the RCTVRP a safety aspect is introduced through risk constraints on the routes: the RCTVRP concerns the issue of security during the transportation of cash or valuable goods (e.g. in the cash-in-transit industry). The other two problems studied in this dissertation belong to the class of location problems: the Load Balancing Problem (LOBA) and the Max-Min Diversity Problem (MMDP). The goal of the LOBA problem is to locate a fixed number of facilities such that the difference between the maximum and minimum number of customers served by a facility is minimized. The LOBA model is useful in cases where customers naturally choose the closest facility. The MMDP consists of selecting a subset of a fixed number of elements from a given set in such a way that the diversity among the selected elements is maximized. This problem also arises in real-world situations in a variety of fields, particularly the social and biological sciences. In order to solve the RCTVRP, a fuzzy GRASP (Greedy Randomized Adaptive Search Procedure) is hybridized with the Path Relinking (PR) methodology.
A carefully adjusted fuzzy modification incorporated into the proposed GRASP for the RCTVRP improved its performance. Moreover, in this dissertation a new PR structure is implemented which can also be used for other vehicle routing problems. To improve the algorithm's time complexity, a new data structure for the RCTVRP is incorporated. The proposed fuzzy GRASP with PR hybrid shows better computational performance than its non-fuzzy version. Furthermore, computational results on publicly available data sets indicate that the proposed algorithm outperforms all existing methods from the literature for solving the RCTVRP. For solving the LOBA problem two efficient hybrid metaheuristic methods are proposed: a combination of reduced and standard variable neighborhood search methods (RVNS-VNS) and a hybridization of an evolutionary algorithm and the VNS approach (EA-VNS). The proposed hybrid methods are first benchmarked against all other methods on existing test instances for the LOBA problem with up to 100 customers and potential suppliers. In order to test the effectiveness of the proposed methods, we modify several large-scale instances from the literature with up to 402 customers and potential suppliers. Exhaustive computational experiments show that the proposed hybrid methods quickly reach all known optimal solutions while providing solutions on large-scale problem instances in short CPU times. Regarding solution quality and running times, we conclude that the proposed EA-VNS approach outperforms the other considered methods for solving the LOBA problem. An EA approach is also proposed for solving the MMDP. Computational experiments on a smaller benchmark data set showed that the classic EA quickly reached all optimal solutions obtained previously by an exact solver. However, some of the larger MMDP instances were challenging for the classic EA.
Although researchers have established commonly used parameter settings for EAs that perform well on most problems, it is still challenging to choose adequate parameter values for a particular algorithm. One approach to overcome this is changing the parameter values during the algorithm run. As part of this dissertation, this problem was addressed by extending the evolutionary algorithm with a fuzzy rule formulated from EA experts' knowledge and experience; the implemented fuzzy rule changes the mutation parameter during the algorithm run. The results on the tested instances indicate that the proposed fuzzy approach is more suitable for solving the MMDP than the classic EA. For all three problems, on the smaller instances that CPLEX was able to solve, the optimal solutions obtained were used for comparison with the proposed methods, and all of the proposed methods reached these optimal solutions. Moreover, in this dissertation it has been shown that fuzzy logic is a successful tool in modeling the RCTVRP. In this problem the risk constraints are set by a risk threshold T on each route, so routes with risk larger than T are forbidden. However, in this dissertation the aim is to take into account the probability of being robbed along each route, instead of merely allowing solutions whose routes satisfy the risk constraints. A new fuzzy model for the RCTVRP is developed which takes into account the risk index of each route; solutions with lower risk indexes on their routes are considered superior. To achieve this, fuzzy numbers are used in the improved model. Moreover, two mixed integer programming formulations of the new fuzzy model are developed and presented in this dissertation. The introduced fuzzy model is compared with the model from the literature on an adequate example, and the advantages of the newly proposed fuzzy RCTVRP are demonstrated.
Computational experiments are performed, and the comparison of the two models shows that the newly presented approach leads to safer routes. URI: http://hdl.handle.net/123456789/4737 ## Files in this item: 1

tezaNinaRadojicic.pdf ( 1.665Mb ) -
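The abstract above describes a fuzzy rule that adapts the mutation parameter during the EA run but does not give its exact form. A minimal sketch of one such rule, assuming a single "IF diversity is low THEN mutation is high" rule with a triangular membership function; the rate bounds and the diversity measure are illustrative assumptions, not the dissertation's actual rule:

```python
def fuzzy_mutation_rate(diversity, low_rate=0.01, high_rate=0.2):
    """Illustrative single fuzzy rule: IF diversity is low THEN mutation is high.

    `diversity` is a normalized population-diversity measure in [0, 1].
    The triangular membership and the rate bounds are assumptions made
    for this sketch, not the dissertation's actual rule.
    """
    # Membership of "diversity is low": 1 at diversity 0, fading to 0 at 0.5
    mu_low = max(0.0, 1.0 - 2.0 * diversity)
    # Defuzzify by blending between the base rate and the boosted rate
    return low_rate + mu_low * (high_rate - low_rate)
```

Plugged into an EA loop, the rate would be recomputed each generation from the current population, so a stagnating run automatically receives stronger mutation.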
Kartelj, Aleksandar (Beograd, 2014)
Abstract: This work investigates the potential of improving the classification process through solving three classification-related problems: feature selection, feature weighting and parameter selection. All three problems are challenging and currently in the focus of scientific research in the field of machine learning. Each problem is solved by using a population-based metaheuristic called the electromagnetism-like method. This method is used for combinatorial and global optimization. It is inspired by the laws of attraction and repulsion among charged particles. Each particle is represented by a vector of real values. The solution of the problem of interest is then obtained by mapping these real-valued vectors to the feasible solution domain. Particles representing better solutions achieve a higher level of charge, which consequently produces a greater impact on other particles. The search process is performed by iterating the particle movement induced by the charges. Two key aspects are managed in implementing the methods: 1) the classification quality obtained after applying the optimization method and 2) the efficiency of the proposed methods in terms of time and space resources. All methods are equipped with problem-specific local search procedures which tend to increase the solution quality. The benefit of applying feature selection to the classification process is twofold. Firstly, the elimination of unnecessary features decreases the data set noise, which degrades the quality of the classification model. Secondly, the problem dimension is decreased and thus the efficiency is increased. The feature selection problem is solved very efficiently by the proposed method. The classification quality is in the majority of cases (instances) improved relative to the methods from the literature. For some of the instances, computational times are up to several hundred times smaller than those of the competing methods.
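The charge mechanism described above can be sketched concretely. In the standard electromagnetism-like formulation, each particle's charge is derived from its objective value so that the best particle gets charge 1 and worse particles exponentially less; the sketch below assumes a minimization problem and the usual exponential charge formula:

```python
import math

def em_charges(objective_values):
    """Charges for the electromagnetism-like method (minimization).

    Uses the standard exponential charge: the best particle receives
    charge 1.0, and particles with worse objective values receive
    exponentially smaller charges.
    """
    n = len(objective_values)
    best = min(objective_values)
    spread = sum(v - best for v in objective_values)
    if spread == 0.0:  # all particles equally good: uniform charges
        return [1.0] * n
    return [math.exp(-n * (v - best) / spread) for v in objective_values]
```

These charges then scale the pairwise attraction toward better particles and repulsion from worse ones that drives the iterated particle movement.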
Feature weighting and parameter selection share a similar underlying solution representation based on vectors of real values. Since the representation of charged particles is based on the same underlying domain, the transition from the particle to the solution domain is smooth. The quality of the method for feature weighting is demonstrated through the nearest-neighbors classification model. The method is tested on different collections of instances and then compared to several methods from the literature. In the majority of cases, the proposed method outperformed the comparison methods. Parameter selection has a great impact on classification quality. The proposed method for parameter selection is applied to the support vector machine, which has a complex parametric structure in terms of the number of parameters and the size of their domains. By using a heuristic initialization procedure, the detection of high-quality regions of parameter combinations is accelerated. Exhaustive tests are performed on instances varying in dimension and feature structure: homogeneous and heterogeneous. Single kernel learning is adopted for homogeneous instances, and multiple kernel learning for heterogeneous ones. The comparison with methods from the literature showed the superiority of the proposed method for single and multiple kernel learning based on the radial basis function; the method proved competitive in the other cases. All proposed methods improved the classification quality. Owing to the way the problems are solved, all three methods can be generalized and applied to a wide class of classification models and classification problems. URI: http://hdl.handle.net/123456789/4234 ## Files in this item: 1

phdAleksandarKartelj.pdf ( 2.121Mb ) -
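The parameter-selection task from the abstract above amounts to searching the (C, gamma) space of an RBF-kernel SVM. The dissertation's EM-based search with heuristic initialization is not reproduced here; as a stand-in, the sketch below shows a generic random log-scale search, where `evaluate` is a hypothetical caller-supplied callback (e.g. returning cross-validation accuracy) and the sampling ranges are assumptions:

```python
import random

def search_svm_params(evaluate, n_trials=50, seed=0):
    """Random log-scale search over (C, gamma) for an RBF-kernel SVM.

    `evaluate(C, gamma)` is a hypothetical callback returning a score to
    maximize; this generic search is only a stand-in for the
    dissertation's electromagnetism-like parameter selection.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        C = 10.0 ** rng.uniform(-2, 3)      # C sampled log-uniformly from [1e-2, 1e3]
        gamma = 10.0 ** rng.uniform(-4, 1)  # gamma sampled log-uniformly from [1e-4, 10]
        score = evaluate(C, gamma)
        if score > best_score:
            best_params, best_score = (C, gamma), score
    return best_params, best_score
```

Sampling on a log scale reflects the common observation that SVM performance varies with the order of magnitude of C and gamma rather than their raw values.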
Perović, Vladimir (Beograd, 2013)
Abstract: Although long-range intermolecular interactions (interactions acting at distances >5Å) play an important role in recognition and targeting between molecules in biological systems, no appropriate software package allows the use of this important property in the investigation of biologically active molecules. The multifunctional EIIP/ISM software, which is based on physical parameters determining long-range molecular properties, was developed in this thesis. This novel and unique platform allows (i) investigation of protein-protein and protein-small molecule interactions, (ii) analysis of the structure/function relationship of proteins, (iii) assessment of the biological effects of mutations in proteins, (iv) monitoring of the functional evolution of proteins, (v) “de novo” design of molecules with desired biological function and (vi) selection of candidate therapeutic molecules. Results of applying the EIIP/ISM platform to diverse problems (e.g. the evolution of influenza A viruses, assessment of the biological effects of mutations in the LPL protein, a risk factor for cardiovascular diseases, identification of therapeutic targets for HIV and influenza viruses, and virtual screening of molecular libraries for candidate antibiotics and anti-HIV drugs), presented in this thesis, confirm the applicability of this platform to a broad spectrum of problems in molecular biology, biomedicine and pharmacology. URI: http://hdl.handle.net/123456789/4230 ## Files in this item: 1

phdPerovic_Vladimir.pdf ( 11.95Mb )
