Listing Computer Science by Title
-
Tanasijević, Ivana (Beograd, 2020)
Abstract: The motivation for writing this doctoral dissertation is a multimedia collection that is the result of many years of field research conducted by researchers from the Institute for Balkan Studies of the Serbian Academy of Sciences and Arts. The collection consists of materials in the form of recorded interviews, various recorded customs, associated textual descriptions (protocols) and numerous other documents. The subject of this dissertation is the study of possibilities and the development of new methods that could serve as a starting point in solving the problem of managing the intangible cultural heritage of the Balkans. The subtasks that emerge in this endeavor are: the design and implementation of a multimedia database of intangible cultural heritage that meets the needs of different types of users; automatic semantic annotation of protocols using natural language processing methods, as a basis for semi-automatic annotation of the multimedia collection; successful search by metadata complying with the CIDOC CRM standard; and the study and development of additional ways of searching this collection in order to gain new knowledge. The main problem with the available methods is that the infrastructure for natural language processing, organization and management in the field of cultural heritage is still underdeveloped for the Balkans, and especially for the Serbian language, so it cannot be used effectively to solve the proposed problem. There is thus a strong need to develop new methods to reach an appropriate solution. For the semi-automatic annotation of multimedia materials, automatic semantic annotation of the protocols associated with the materials was used.
It was carried out by methods of information extraction, named entity recognition and topic extraction, using rule-based techniques supported by additional resources such as electronic dictionaries, thesauri and domain-specific vocabularies. To classify textual protocols by topic, research was conducted on methods for classifying texts in the Serbian language, and a method was proposed that is adapted to the specific domain being processed (intangible cultural heritage), to the specific problem being solved (classification of protocols by topic) and to Serbian as a morphologically rich language. To work with spatial data, a spatial model has been developed that is suitable both for displaying results on a map and for creating spatial queries through an interactive graphical map of locations. The results of experiments conducted with the developed methods show that a rule-based approach combined with additional language resources, given a reasonable amount of effort, yields very good results for the task of information extraction.
An F measure of 0.87 was reached for the extraction of named entities and 0.90 for the extraction of topics, which is in the range of measures from published research on similar problems and domains. The results of text classification indicate that the selected statistical machine learning methods in their basic form, although generally successful, give a poor F measure of 0.44 when applied to the protocols, while a significant improvement is achieved with the use of semantic techniques, in which case an F measure of 0.88 is reached. Some of the results presented in this dissertation are contained in the papers [266], [265], [94], [264], [267], which have been published or accepted for publication. The conclusions drawn from the research are that solving the given problem requires engaging experts from several fields; that the needs of different groups of users are complex, which complicates the task of organizing and managing the multimedia collection; that the domain of cultural heritage is very rich in semantics; that context plays a major role in the tasks of information extraction and text classification; and finally that for these tasks the developed rule-based methods of natural language processing, as well as statistical machine learning techniques, prove to be successful. URI: http://hdl.handle.net/123456789/5093 Files in this item: 1
IvanaTanasijevicdr.pdf ( 4.674Mb ) -
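The F measures quoted in the abstract above (0.87, 0.90, 0.44, 0.88) are the harmonic mean of precision and recall. A minimal sketch of how such a score is computed from extraction output; the entity names here are illustrative, not taken from the dissertation's data:

```python
def f_measure(predicted, gold):
    """F1 score: harmonic mean of precision and recall over two sets."""
    tp = len(predicted & gold)          # true positives
    if not predicted or not gold or tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Hypothetical named-entity extraction output vs. gold annotations
gold = {"Vlasotince", "Djurdjevdan", "lazarice"}
pred = {"Vlasotince", "Djurdjevdan", "kolo"}
score = f_measure(pred, gold)  # precision 2/3, recall 2/3 -> F1 = 2/3
```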
Džamić, Dušan (Beograd, 2021)
Abstract: The theory of complex networks has proven to be very important in the study of the characteristics and structure of various complex systems. In the last two decades, a large number of research efforts have been directed towards the development of methods for clustering in complex networks. In 2012, the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS), a well-known consortium of prestigious academic institutions (Rutgers University, Princeton, Columbia) and research laboratories (Microsoft, IBM, AT&T, NEC), included the problem of clustering in complex networks on its list of the most important problems and challenges in computer science. Clustering in complex networks can be applied in a variety of contexts to achieve different goals, and therefore there is no generally accepted definition of a cluster. For this reason, different approaches are used in developing clustering methods. The approach that has attracted the most attention involves two subproblems: defining a function to determine the quality of a partition, and constructing methods to find a partition that maximizes the defined quality function. In this approach, clustering is formulated as a combinatorial optimization problem, and various methods of mathematical optimization can be used to solve it. One of the most commonly used quality functions is modularity. Clustering by modularity maximization, i.e., finding a partition with the maximum value of modularity, is an NP-hard problem, so only heuristic algorithms are suitable for processing large datasets. In this dissertation, a novel method for modularity maximization based on the variable neighborhood search heuristic is proposed. For efficient application to large-scale complex networks, a procedure for decomposing the problem into smaller subproblems is developed.
In addition, a mechanism for overcoming local maxima of modularity is improved using criteria for the occasional acceptance of a solution worse than the current one. DIMACS instances are used to test the proposed method, and the obtained results are compared with the best ones presented in the literature, obtained by two methods developed in the DIMACS implementation challenge in 2012. In addition, the obtained results are compared with the results of six methods from the literature developed after 2012. A comparative analysis of the results shows that the proposed method outperforms the existing methods for modularity maximization and improves the best known solutions on 9 out of 13 hard instances. Clustering by modularity maximization is not suitable for detecting small clusters in large networks. For this reason, a new function for measuring the quality of a partition is proposed in the dissertation. Through three theorems, it is shown that the new measure, called the E-function, has the potential to identify clusters in the network and overcome the limits of modularity. To test the proposed E-function and compare it with the modularity function, a generic variable neighborhood search method is developed that optimizes the considered quality function. Computational experiments are performed on generated and real instances from the literature for which the correct division into clusters is known. The results show that the expected clusters can be identified, both on artificial and real instances, by maximizing the E-function. URI: http://hdl.handle.net/123456789/5214 Files in this item: 1
teza_dzamic.pdf ( 3.074Mb ) -
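Modularity, the quality function maximized above, compares the fraction of intra-cluster edges with its expected value under a degree-preserving random model. A self-contained sketch of the computation (the graph is illustrative, not a DIMACS instance):

```python
def modularity(edges, communities):
    """Newman modularity Q = sum_c (l_c / m - (d_c / 2m)^2),
    where l_c = intra-community edges and d_c = total degree in community c."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = 0.0
    for comm in communities:
        c = set(comm)
        l_c = sum(1 for u, v in edges if u in c and v in c)
        d_c = sum(degree[u] for u in c)
        q += l_c / m - (d_c / (2 * m)) ** 2
    return q

# Two triangles joined by a bridge: the natural partition scores Q = 5/14
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q = modularity(edges, [[0, 1, 2], [3, 4, 5]])
```

Putting all vertices in one community always gives Q = 0, which is why the maximization is informative only when a non-trivial partition beats it.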
Vučković, Bojan (Beograd, 2017)
Abstract: We present original results from the following fields of discrete mathematics: chromatic graph theory, extremal set theory and Boolean matrix theory. Within chromatic graph theory we investigate edge and total colorings satisfying the condition that neighboring vertices of a graph receive different values of the multi-set, set or sum induced by the given coloring. A multi-set neighbor-distinguishing edge coloring of a graph is an assignment of colors to edges such that, for every edge uv of the graph, the multi-set of colors of the edges incident with the vertex u differs from the multi-set of colors of the edges incident with the vertex v. The previous best result on the minimum number of colors required for such a coloring of an arbitrary graph states that four colors are sufficient. The author's contribution is a proof that such a coloring is always possible with only three colors, which in the general case is the optimal number of colors. We construct a graph for which we prove that different numbers of colors are required to obtain a multi-set neighbor-distinguishing coloring and a neighbor-distinguishing coloring by sum. As far as we know, this is the first example of such a graph. A few results concerning the neighbor expanded sum distinguishing coloring are given. The main contribution is a proof that for an arbitrary graph there exists a total coloring with colors from the set {1, 2, 3} such that every two adjacent vertices have different sums over their adjacent vertices and incident edges. Also, for certain classes of graphs it is proved that such a coloring exists using only the colors from the set {1, 2}. A neighbor-distinguishing edge coloring of a graph G requires that every two adjacent edges receive different colors, while the sets of colors of the edges incident with the vertices u and v differ for every edge uv of G.
The author presents an edge coloring procedure for an arbitrary graph without isolated edges that uses a smaller number of colors than all known results. For the adjacent vertex distinguishing total coloring of a graph G, the condition is that every two adjacent or incident elements of V(G) ∪ E(G) receive different colors, while for every edge uv of G the set composed of the colors assigned to the edges incident with u, together with the color of u, differs from the corresponding set for v. The author improves the upper bound on the minimum number of colors needed for such a coloring, in terms of the maximum degree of the graph. Frankl's conjecture from extremal set theory states that for every family of sets closed under union there exists an element contained in at least half of the sets of the family. We give a proof that Frankl's conjecture holds for every family on 12 elements, while it was previously known to hold for families on 11 or fewer elements. Our proof is based on an efficient algorithm that exhausts all possibilities, using results about subfamilies that an eventual counterexample cannot contain, which we obtained in a number of consecutive steps. A family of sets G is an FC-family if for every family F containing G there exists an element of ∪G that appears in at least half of the sets of F. A NonFC-family is every family that is not FC. The author's contribution is the complete classification into FC- and NonFC-families of all families on 6 or fewer elements. From Boolean matrix theory we present our results concerning row space cardinality. Boolean matrices are matrices whose components all belong to the set {0, 1}, while the row space of a Boolean matrix is the set of vectors that can be obtained by disjunction of rows of the matrix.
We present the set consisting of all values a from the interval [2n2 + 2n3, 2n2] such that there exists an n × n matrix whose row space cardinality equals a. For the least positive integer a_n for which there exists no n × n matrix with row space cardinality a_n, the author gives a lower bound that improves on the previously known results. All proofs of the main results in the dissertation are constructive. Proofs of some of them required the use of computers to examine a great number of possibilities. For the other proofs this was not necessary, though algorithms following the steps of the proofs can be implemented to obtain a graph coloring or a matrix with the desired properties. URI: http://hdl.handle.net/123456789/4661 Files in this item: 1
Disertacija_-_Bojan_Vuckovic.pdf ( 1.143Mb ) -
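The row space discussed above can be enumerated directly for small matrices by closing the set of rows under componentwise disjunction (OR). A brute-force sketch, with the convention (assumed here) that the empty disjunction, i.e. the zero vector, belongs to the row space:

```python
from itertools import combinations

def row_space_cardinality(matrix):
    """Encode rows as bitmasks and take the closure of the rows under
    bitwise OR; the zero vector (empty disjunction) is included."""
    rows = [int("".join(map(str, r)), 2) for r in matrix]
    space = {0}
    for r in range(1, len(rows) + 1):
        for combo in combinations(rows, r):
            v = 0
            for x in combo:
                v |= x
            space.add(v)
    return len(space)

# The 3x3 identity matrix: all 2^3 = 8 disjunctions of rows are distinct
n = row_space_cardinality([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```

The exponential enumeration is only for illustration; the dissertation's results characterize which cardinalities are attainable without such search.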
Filipović, Vladimir (Beograd, 2006)
URI: http://hdl.handle.net/123456789/2478 Files in this item: 1
phdFilipovicVladimir.pdf ( 2.145Mb ) -
Pejović, Aleksandar (University of Novi Sad, Faculty of Technical Sciences, 2020)
Abstract: This dissertation is about the development of a parallel software system for representing and solving problems of finite model theory, and its applications. The theoretical foundation of the system is presented, as well as an in-depth explanation of the implementation in Python. In particular, a parallel method for computing Boolean expressions based on the properties of finite free Boolean algebras is developed. It is also shown how various finite combinatorial objects can be coded in the formalism of Boolean algebras and counted by this procedure. Specifically, using a translation of first-order predicate formulas to propositional formulas, we developed a technique for constructing and counting finite models of first-order theories. Finally, we have developed some general techniques that enable more effective use of our system. We illustrate these techniques on two examples: the first deals with partial orders, while the other is about random graphs. URI: http://hdl.handle.net/123456789/4857 Files in this item: 3
APejovicDissertation.pdf ( 1.253Mb ) APejovicPhDPresentation.pdf ( 1.103Mb ) APejovicPhDSoftwareSources.zip ( 25.25Kb ) -
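The counting-of-finite-models idea above — fix a finite domain, encode a first-order property, and count the structures satisfying it — can be mirrored by a naive brute force. The dissertation's system works through Boolean-algebra computations; this sketch only illustrates the idea on the partial-order example:

```python
from itertools import product

def is_partial_order(rel, domain):
    """Check reflexivity, antisymmetry and transitivity of a binary relation."""
    r = set(rel)
    if any((a, a) not in r for a in domain):
        return False
    if any((a, b) in r and (b, a) in r and a != b
           for a in domain for b in domain):
        return False
    return all((a, c) in r
               for a in domain for b in domain for c in domain
               if (a, b) in r and (b, c) in r)

domain = range(3)
pairs = [(a, b) for a in domain for b in domain]
# Enumerate all 2^9 binary relations on a 3-element set and count the models
count = sum(
    1
    for bits in product([0, 1], repeat=len(pairs))
    if is_partial_order([p for p, bit in zip(pairs, bits) if bit], domain)
)
# count == 19, the number of partial orders on 3 labeled elements
```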
Carić, Marko (Beograd, 2023)
Abstract: This dissertation discusses the problem of calculating the number of equivalence classes of Boolean functions. The difficulty of determining the number of equivalence classes increases sharply with the number of variables n. The motivation for choosing this topic lies in the fact that concrete numbers have so far been known only for relatively small values of n, although the problem itself has long been theoretically solved. Let G be a permutation group acting on the set Bn = {0,1}^n. G acts both on scalar Boolean functions, Bn → B1, and on invertible vector Boolean functions, Bn → Bn. Two scalar Boolean functions f(k) and g(k) defined on Bn are considered equivalent with respect to G if there exists σ ∈ G such that f(k) = g(σ(k)) holds for every k ∈ Bn. Two invertible vector Boolean functions f(k) and g(k) are considered equivalent with respect to G if there exist σ, τ ∈ G such that f(k) = τ(g(σ(k))) for every k ∈ Bn. The equivalence relation ∼ decomposes the set of all Boolean functions into equivalence classes. The equivalence of Boolean functions has significant applications in the logical synthesis of combinational circuits and in cryptography, especially in connection with the design of S-boxes. Let Un(G) and Vn(G) denote the number of equivalence classes of scalar, respectively invertible vector, Boolean functions of n variables with respect to the group G. The numbers Un(G) and Vn(G) can be calculated relatively simply if the cycle index of the group of permutations of Bn induced by G is known. Four groups are considered: the group Sn of permutations of the coordinates of elements k = (k1, k2, ..., kn) ∈ Bn; the group Gn induced by permutations and complementations of coordinates; the group GLn of invertible linear transformations of the vector space Bn; and the group AGLn of invertible affine transformations of Bn. If the permutation σ has i_k cycles of length k, its cycle structure is ⟨σ⟩ = (i1, i2, ...), and the cycle index of the group G is the generating function ZG(f1, f2, ...) = (1/|G|) Σ_{σ∈G} Π_k f_k^{i_k}. The cycle structures of these permutation groups are known, but the cycle indices themselves, i.e. the numbers Un(G) and Vn(G), have in practice been calculated only for relatively small values of n. The dissertation presents original results in the enumeration of equivalence classes of Boolean functions with respect to these four groups of transformations. For each of the four groups, an expression for the cycle index is derived in the form of a sum over partitions of the number n; based on these cycle indices, the class numbers could be computed for much larger n. An overview of the previously known results (for relatively small n) and the new results of the thesis (for larger n) is shown in the following table:

Number \ G   Sn       Gn       GLn      AGLn
Un(G)        11→33    10→32    8→31     10→31
Vn(G)        6→30     7→27     6→26     6→26

A separate method for calculating the number of equivalence classes that does not use a cycle index is shown and described in the third paper from the introductory chapter. The second part of the dissertation concerns monotone Boolean functions, i.e. scalar Boolean functions that satisfy the condition of monotonicity (k ≤ i implies f(k) ≤ f(i)). Let rn denote the number of equivalence classes of monotone Boolean functions of n variables. The difficulty of calculating rn increases rapidly with n, so that until recently the last computed term of the sequence was r7. The procedure described in the dissertation is based on Frobenius' theorem, and on its basis the number r8 was determined. The dissertation consists of an introductory first chapter and three further chapters. In the second chapter, theoretical concepts related to the material of chapters 3 and 4 are introduced; they concern discrete mathematics, combinatorics and the cycle index. In the third chapter, expressions for the cycle indices of the four considered permutation groups are derived, as well as the numbers Un(G) and Vn(G) of equivalence classes of Boolean functions with respect to these groups. First, improvements common to all four groups are discussed, and then the specific accelerations related to individual groups, group by group. In Chapter 4, the problem of finding the number of equivalence classes of monotone Boolean functions is solved. First, a general expression for calculating this number is given, based on the Frobenius theorem, in the form of a sum over partitions of the number n of the numbers of functions fixed by permutations with the corresponding cycle structure. Then, ways of calculating the numbers of fixed points for n ≤ 8 are shown. The procedure by which the number r8 was calculated, which represents an original contribution of this dissertation, is presented in the first paper from the list in the introductory chapter; another group of researchers obtained the same result practically at the same time. URI: http://hdl.handle.net/123456789/5571 Files in this item: 1
MarkoCaricDisertacija.pdf ( 1.467Mb ) -
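The cycle-index approach above rests on Burnside's (Frobenius') lemma: the number of equivalence classes equals the average, over the group, of the number of functions fixed by each group element. A brute-force check for the action of Sn on scalar Boolean functions, feasible only for tiny n — which is exactly why the dissertation derives cycle indices instead:

```python
from itertools import permutations, product
from math import factorial

def classes_under_Sn(n):
    """Count equivalence classes of Boolean functions f: {0,1}^n -> {0,1}
    under permutation of input variables, via Burnside's lemma."""
    points = list(product([0, 1], repeat=n))
    total = 0
    for perm in permutations(range(n)):
        # A function is fixed by perm iff it is constant on each orbit of
        # perm acting on {0,1}^n, so it contributes 2^(number of orbits).
        seen, orbits = set(), 0
        for p in points:
            if p in seen:
                continue
            orbits += 1
            q = p
            while q not in seen:
                seen.add(q)
                q = tuple(q[perm[i]] for i in range(n))
        total += 2 ** orbits
    return total // factorial(n)

# classes_under_Sn(2) == 12 and classes_under_Sn(3) == 80
```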
Maljković, Mirjana (Beograd, 2021)
Abstract: Proteins are linear biological polymers composed of amino acids, whose structure and function are determined by the number and order of those amino acids. Protein structure has three levels: primary, secondary and tertiary (three-dimensional, 3D). Since the experimental determination of protein 3D structure is expensive and time-consuming, it is important to develop predictors of protein 3D structure properties from the amino acid sequence (primary structure), such as the 3D structure of the protein backbone. The 3D structure of the backbone can be described using prototypes of local protein structure, i.e. prototypes of protein fragments a few amino acids long. A set of local structure prototypes determines a library of local protein structures, also called a structural alphabet. A structural alphabet is defined as a set of N prototypes of length L. The subject of this dissertation is the development of models for the prediction of structural alphabet prototypes for a given amino acid sequence using different data mining approaches. Protein Blocks (PBs), one of the best-known structural alphabets, was used in one part of the doctoral research. The structural alphabet PBs consists of 16 prototypes defined over fragments of 5 consecutive amino acids. The amino acid sequence is combined with structural properties of the protein that can be determined from the sequence (occurrence of repeats in the amino acid sequence) and with the results of predictors of protein structural properties (backbone angles, secondary structures, occurrence of disordered regions, accessible surface area of amino acids) as input to the prediction model of structural alphabet prototypes.
Besides the development of models for predicting the prototypes of an existing structural alphabet, the possibility of developing new structural alphabets is researched by applying the TwoStep clustering algorithm and constructing models for the prediction of the prototypes of the new structural alphabets. Several structural alphabets, which differ in the length and number of prototypes, have been constructed and analyzed. Fragments of a large number of proteins whose structure has been experimentally determined were used to construct the new structural alphabets. URI: http://hdl.handle.net/123456789/5236 Files in this item: 1
dthesis.Matf.Mirjana.Maljkovic.pdf ( 47.43Mb ) -
Radojčić, Nina (Beograd, 2018)
Abstract: In this dissertation, three NP-hard optimization problems are studied and various computational intelligence methods are considered for solving them, with special emphasis on the possibilities of applying fuzzy logic to improve the performance of the proposed methods. In addition, it is shown how fuzzy logic can be incorporated into a model to make it more adequate for real-world applications. The first problem considered is the Risk-Constrained Cash-in-Transit Vehicle Routing Problem (RCTVRP), a special case of the vehicle routing problem (VRP). As in the classical VRP, the aim is to determine collection routes from one depot to a number of customers so as to minimize the overall travel distance (or cost). Additionally, in the RCTVRP, risk constraints on the routes are introduced to capture the safety aspect: the RCTVRP concerns the issue of security during the transportation of cash or valuable goods (e.g. in the cash-in-transit industry). The other two problems studied in this dissertation belong to the class of location problems: the Load Balancing Problem (LOBA) and the Max-Min Diversity Problem (MMDP). The goal of the LOBA problem is to locate a fixed number of facilities such that the difference between the maximum and minimum number of customers served by each facility is balanced. The LOBA model is useful in cases where customers naturally choose the closest facility. The MMDP consists of selecting a subset of a fixed number of elements from a given set in such a way that the diversity among the selected elements is maximized. This problem also arises in real-world situations in a variety of fields, particularly the social and biological sciences. In order to solve the RCTVRP, a fuzzy GRASP (Greedy Randomized Adaptive Search Procedure) is hybridized with the Path Relinking (PR) methodology.
A carefully adjusted fuzzy modification incorporated into the proposed GRASP for the RCTVRP improved its performance. Moreover, in this dissertation a new PR structure is implemented that can also be used for other vehicle routing problems. To improve the algorithm's time complexity, a new data structure for the RCTVRP is incorporated. The proposed fuzzy GRASP with PR hybrid shows better computational performance than its non-fuzzy version. Furthermore, computational results on publicly available data sets indicate that the proposed algorithm outperforms all existing methods from the literature for solving the RCTVRP. For solving the LOBA problem, two efficient hybrid metaheuristic methods are proposed: a combination of reduced and standard variable neighborhood search methods (RVNS-VNS), and a hybridization of an evolutionary algorithm with the VNS approach (EA-VNS). The proposed hybrid methods are first benchmarked against all other methods on existing test instances for the LOBA problem with up to 100 customers and potential suppliers. To test the effectiveness of the proposed methods, we modify several large-scale instances from the literature with up to 402 customers and potential suppliers. Exhaustive computational experiments show that the proposed hybrid methods quickly reach all known optimal solutions while providing solutions on large-scale problem instances in short CPU times. Regarding solution quality and running times, we conclude that the proposed EA-VNS approach outperforms the other considered methods for solving the LOBA problem. An EA approach is also proposed for solving the MMDP. Computational experiments on a smaller benchmark data set showed that the classic EA quickly reached all optimal solutions obtained previously by an exact solver. However, some of the larger instances of the MMDP were challenging for the classic EA.
Although researchers have established commonly used parameter settings for EAs that perform well on most problems, it is still challenging to choose adequate values for the parameters of the algorithm. One approach to overcome this is changing parameter values during the algorithm run. As part of this dissertation, this problem was addressed by extending the evolutionary algorithm with a fuzzy rule formulated from EA experts' knowledge and experience. The implemented fuzzy rule changes the mutation parameter during the algorithm run. The results on tested instances indicate that the proposed fuzzy approach is more suitable for solving the MMDP than the classic EA. For all three problems, on the smaller instances that CPLEX was able to solve, the obtained optimal solutions were used for comparison with the proposed methods, and all of the proposed methods reached these optimal solutions. Moreover, in this dissertation it has been shown that fuzzy logic is a successful tool in modeling the RCTVRP. In this problem the risk constraints are set by using a risk threshold T on each route, so that routes with risk larger than T are forbidden. However, in this dissertation the aim is to take into account the probability of being robbed along each route, instead of merely allowing solutions whose routes satisfy the risk constraints. A new fuzzy model for the RCTVRP is developed which takes into account the value of the risk index of each route; solutions with lower risk indexes on their routes are considered superior. To achieve this, fuzzy numbers are used in the improved model. Moreover, two mixed integer programming formulations of the new fuzzy model are developed and presented in this dissertation. The introduced fuzzy model is compared with the model from the literature on an adequate example, and the advantages of the newly proposed fuzzy RCTVRP are demonstrated.
Computational experiments are performed, and the comparison of the two models shows that the newly presented approach leads to safer routes. URI: http://hdl.handle.net/123456789/4737 Files in this item: 1
tezaNinaRadojicic.pdf ( 1.665Mb ) -
Kartelj, Aleksandar (Beograd, 2014)
Abstract: This work investigates the potential of improving the classification process through solving three classification-related problems: feature selection, feature weighting and parameter selection. All three problems are challenging and currently in the focus of scientific research in the field of machine learning. Each problem is solved using a population-based metaheuristic called the electromagnetism-like method. This method is used for combinatorial and global optimization. It is inspired by the laws of attraction and repulsion among charged particles. Each particle is represented by a vector of real values. The solution of the problem of interest is then obtained by mapping these real-valued vectors to the feasible solution domain. Particles representing better solutions achieve a higher level of charge, which consequently produces a greater impact on other particles. The search process is performed by iterating the particle movement induced by the charges. In implementing the methods, two key aspects are managed: 1) the classification quality obtained after applying the optimization method, and 2) the efficiency of the proposed methods in terms of time and space resources. All methods are equipped with problem-specific local search procedures which tend to increase solution quality. The benefit of applying feature selection to the classification process is twofold. Firstly, the elimination of unnecessary features decreases the noise in the data set, which degrades the quality of the classification model. Secondly, the problem dimension is decreased, thus increasing efficiency. The feature selection problem is solved very efficiently by the proposed method. The classification quality is in the majority of cases (instances) improved relative to the methods from the literature. For some of the instances, computational times are up to several hundred times smaller than those of the competing methods.
The feature weighting and parameter selection problems share a similar underlying solution representation, based on vectors of real values. Since the representation of charged particles is based on the same underlying domain, the transition from the particle to the solution domain is smooth. The quality of the method for feature weighting is demonstrated through a nearest neighbors classification model. The method is tested on different collections of instances and then compared to several methods from the literature. In the majority of cases, the proposed method outperformed the comparison methods. Parameter selection in classification has a great impact on classification quality. The proposed method for parameter selection is applied to the support vector machine, which has a complex parametric structure with respect to the number of parameters and the size of their domains. By using a heuristic initialization procedure, the detection of high-quality regions of parameter combinations is accelerated. Exhaustive tests are performed on instances of various dimensions and feature structures, homogeneous and heterogeneous. Single kernel learning is adopted for homogeneous instances, and multiple kernel learning for heterogeneous ones. The comparison with methods from the literature showed the superiority of the proposed method when single and multiple kernel learning based on the radial basis function is considered; in the other cases the method proves competitive. All proposed methods improved the classification quality. Because of the way the problems are solved, all three methods can be generalized and applied to a wide class of classification models and/or classification problems. URI: http://hdl.handle.net/123456789/4234 Files in this item: 1
phdAleksandarKartelj.pdf ( 2.121Mb ) -
Grbić, Milana (Beograd, 2020)
Abstract: In this dissertation some current problems of bioinformatics and computational biology are explored, together with methods for solving them. The following problems are considered: partitioning of sparse biological networks into k-plex subnetworks, prediction of the role of metabolites in metabolic reactions, partitioning of biological networks into highly connected components, and identification of significant groups of proteins by adding new edges to the weighted protein interactions network. The aforementioned problems have theoretical importance in the areas of machine learning and optimization, and practical application in biological research. In addition to solving these problems from the computational aspect, the dissertation explores further application of the obtained results in the fields of biology and biochemistry, as well as the integration of the results within existing bioinformatics tools. The problem of predicting the role of metabolites in metabolic reactions is solved by a predictive machine learning method based on conditional random fields, while for the remaining three problems algorithms based on variable neighbourhood search are developed. For the problem of identifying significant groups of proteins by adding new edges to the weighted protein interactions network, variable neighbourhood search is only the first phase of the proposed solution, while in the second and third phases of the proposed method integration with additional biological information and bioinformatics tools is performed. The proposed computational methods of partitioning and grouping in biological networks confirm existing findings in a new manner and lead to new discoveries about biological elements and the connections between them.
By solving these problems and by interpreting the obtained resultsin this dissertation, a scientific contribution was made to the scientific field of computer science,particularly to the scientific disciplines of bioinformatics and computational biology. URI: http://hdl.handle.net/123456789/5088 Dateien zu dieser Ressource: 1
grbic_Milana_disertacija.pdf ( 8.740Mb ) -
Jovanović, Jasmina (Beograd , 2022)[more][less]
Zusammenfassung: The analysis of biological sequence similarity between different species is significant in identifying functional, structural or evolutionary relationships among the species. Biological sequence similarity and the analysis of newly discovered nucleotide and amino acid sequences are demanding tasks in bioinformatics. As biological data grows exponentially, new and innovative algorithms must constantly be developed for faster and more effective data processing. The challenge for sequence similarity algorithms is that a sequence does not always have obvious features, and the dimension of sequence features may be too high for applying regular feature selection methods. It is important to have a simple and effective algorithm for determining biological sequence relationships. This thesis proposes two new methods for transforming sequences into feature vectors that take into consideration statistically significant repetitive parts of the analyzed sequences, and includes different approaches for determining nucleotide sequence similarity and for sequence classification to predict taxonomy groups of biological sequence data. The first method is based on information theory and the fact that neither the position nor the frequency of repeated sequences is expected to occur identically in a random sequence of the same length. The second method builds signatures of biological sequences and profiles of taxonomic classes based on repetitive parts of sequences and the distances between these repeats. The proposed methods have been validated on multiple data sets and compared with results obtained using well-known and accepted methods in this field, such as BLAST, Clustal Omega and methods based on k-mers. The resulting precision of the proposed methods is close to that of existing methods for the majority of tested data sets, and time performance depends strictly on the infrastructure used and the sequence type.
The methods provide results comparable with other commonly used methods for the same problem, taking into consideration statistically significant repetitive parts of sequences with different characteristics. URI: http://hdl.handle.net/123456789/5440 Dateien zu dieser Ressource: 1
JasminaJovanovic.pdf ( 3.984Mb ) -
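The general idea of turning sequences into comparable feature vectors, as in the abstract above, can be illustrated with the simplest variant mentioned there: plain k-mer counting. This is a simplified stand-in for the thesis's repeat-based methods, not a reproduction of them; the example sequences are made up.

```python
from collections import Counter
from math import sqrt

def kmer_vector(seq, k=3):
    """Count all overlapping k-mers in a nucleotide sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine(a, b):
    """Cosine similarity between two sparse k-mer count vectors."""
    keys = set(a) | set(b)
    dot = sum(a[x] * b[x] for x in keys)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

s1 = "ATGGCGTACATGGCG"   # shares most 3-mers with s2
s2 = "ATGGCGTTCATGGCA"
s3 = "CCCCCCCCCCCCCCC"   # shares none with s1
print(cosine(kmer_vector(s1), kmer_vector(s2)) > cosine(kmer_vector(s1), kmer_vector(s3)))  # → True
```

The thesis's methods refine this picture by weighting repeats that are statistically unexpected for a random sequence of the same length, rather than treating all k-mers equally.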
Perović, Vladimir (Beograd , 2013)[more][less]
Zusammenfassung: Although long-range intermolecular interactions (interactions acting at distances >5Å) play an important role in recognition and targeting between molecules in biological systems, no appropriate software package allows the use of this important property in the investigation of biologically active molecules. The multifunctional EIIP/ISM software, based on physical parameters determining long-range molecular properties, was developed in this thesis. This novel and unique platform allows (i) investigation of protein-protein and protein-small molecule interactions, (ii) analysis of the structure/function relationship of proteins, (iii) assessment of the biological effects of mutations in proteins, (iv) monitoring of the functional evolution of proteins, (v) "de novo" design of molecules with a desired biological function and (vi) selection of candidate therapeutic molecules. Results of applying the EIIP/ISM platform to diverse problems (e.g. the evolution of influenza A viruses, assessment of the biological effects of mutations on the LPL protein, a risk factor for cardiovascular diseases, identification of therapeutic targets for HIV and influenza viruses, and virtual screening of molecular libraries for candidate antibiotics and anti-HIV drugs), presented in this thesis, confirm the applicability of this platform to a broad spectrum of problems in molecular biology, biomedicine and pharmacology. URI: http://hdl.handle.net/123456789/4230 Dateien zu dieser Ressource: 1
phdPerovic_Vladimir.pdf ( 11.95Mb ) -
Đenić, Aleksandar (Beograd , 2018)[more][less]
Zusammenfassung: This paper considers two discrete location problems: the Bus Terminal Location Problem (BTLP) and the Long-term Care Facility Location Problem (LTCFLP). A Variable Neighborhood Search (VNS) method for solving BTLP and LTCFLP is presented. VNS is a single-solution metaheuristic based on a systematic change of neighborhoods while searching for the optimal solution of the problem. It consists of two main phases: a shake phase and a local search phase. BTLP is a discrete location problem that considers locating bus terminals so as to provide the highest possible quality of public service to clients. Clients are represented as public transportation stations, such as bus or metro stations. The VNS algorithm used for solving BTLP employs an improved local search based on efficient neighborhood interchange. The VNS is parallelized (PVNS), which leads to a significant time improvement as a function of the processor core count. Computational results show that the proposed PVNS method improves existing results from the literature in terms of quality. Larger instances, based on instances from the Traveling Salesman Problem library, are presented and computational results for those instances are reported. LTCFLP was created as part of health care infrastructure planning in South Korea. Clients are groups of patients in need of long-term health care, while established facilities are locations where centers providing health care services should be built. There are n predefined locations where centers can be established. The problem seeks at most K locations at which to establish health centers so that they are equally loaded with client demand. For solving LTCFLP with the VNS algorithm, a data structure based on fast interchange is presented.
It reduces the time complexity of one iteration of the local search algorithm to O(n · max(n, K²)), compared to the known time complexity from the literature, O(K² · n²). The reduced time complexity of the presented VNS leads to better-quality solutions, since a larger number of VNS iterations can be performed in less computational time. This paper presents computational results that outperform the best known results from the literature. URI: http://hdl.handle.net/123456789/4744 Dateien zu dieser Ressource: 1
Aleksandar_Djenic_phd.pdf ( 2.183Mb ) -
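The two-phase VNS loop described above (shake in the k-th neighborhood, then local search, moving and resetting k on improvement) follows a standard skeleton, which can be sketched on a toy binary problem. The bit-flip neighborhoods and the hidden-target objective are illustrative assumptions, not the location problems from the dissertation.

```python
import random

def local_search(sol, cost):
    """First-improvement bit-flip descent."""
    improved = True
    while improved:
        improved = False
        for i in range(len(sol)):
            cand = sol[:i] + [1 - sol[i]] + sol[i + 1:]
            if cost(cand) < cost(sol):
                sol, improved = cand, True
                break
    return sol

def shake(sol, k, rng):
    """Flip k random bits: the k-th neighborhood structure."""
    sol = sol[:]
    for i in rng.sample(range(len(sol)), k):
        sol[i] = 1 - sol[i]
    return sol

def vns(initial, cost, k_max=3, max_iter=50, seed=0):
    rng = random.Random(seed)
    best = initial
    for _ in range(max_iter):
        k = 1
        while k <= k_max:
            cand = local_search(shake(best, k, rng), cost)  # shake, then descend
            if cost(cand) < cost(best):
                best, k = cand, 1   # improvement: recenter, restart neighborhoods
            else:
                k += 1              # no improvement: widen the neighborhood
    return best

# Toy objective: Hamming distance to a hidden target bit pattern.
target = [1, 0, 1, 1, 0, 0, 1, 0]
cost = lambda s: sum(a != b for a, b in zip(s, target))
print(vns([0] * 8, cost))  # → [1, 0, 1, 1, 0, 0, 1, 0]
```

The fast-interchange data structure mentioned in the abstract would live inside `local_search`, making each candidate evaluation incremental instead of recomputing the objective from scratch.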
Mišković, Stefan (Beograd , 2016)[more][less]
Zusammenfassung: In this dissertation, three NP-hard min-max discrete optimization problems are considered. The first is the multi-period emergency service location problem, the second is the dynamic maximal covering location problem with multiple covering radii, and the third is the uncapacitated multiple allocation p-hub center problem. In many practical situations, input parameters (such as user demands, transportation time or cost) vary with unknown distributions. It is therefore necessary to incorporate these uncertainties into the deterministic variants of the problems by applying a robust optimization approach. Mathematical models for the deterministic and non-deterministic variants of all three problems are developed, except for the deterministic uncapacitated multiple allocation p-hub center problem, which has already been addressed in the literature. In addition, for the first time in the literature, it is proven that the emergency service location problem is NP-hard. The considered problems and their robust variants have numerous applications, since in real-life situations input parameters are often subject to uncertainty. The multi-period emergency service location problem may be used when determining optimal locations for police stations, fire brigades, ambulances, and other emergency units in a given region. The dynamic maximal covering location problem with multiple covering radii is useful when choosing the optimal strategy for establishing resources (service centers, suppliers, facilities, etc.) with maximal satisfaction of customer demands in a certain region, assuming that service efficiency directly depends on the distance between customer and service center (i.e., the selected coverage radius). The uncapacitated multiple allocation p-hub center problem has significant applications in designing telecommunication and transportation networks, postal delivery systems, emergency systems, supply networks, etc.
Since exact methods provide optimal solutions only for problem instances of small dimensions, hybrid metaheuristic algorithms are developed to solve both the deterministic and robust variants of the considered problems. The proposed hybrid algorithms combine particle swarm optimization with a local search heuristic: classical local search or the variable neighborhood search method. For the dynamic maximal covering location problem with multiple covering radii, a hybridization of the metaheuristic algorithm with an exact method based on linear programming is developed. All elements of the proposed algorithms are adapted to the problems under consideration. Different strategies are implemented to improve the efficiency of the proposed algorithms, especially for the calculation of the objective function value and the local search part. The influence of different parameters of the hybrid algorithms on solution quality is analyzed in detail. All parameters are adjusted using analysis of variance. For all considered problems (both deterministic and robust variants), the performance of the proposed hybrid algorithms is evaluated on adequate test data sets. The proposed algorithms are compared with existing heuristics from the literature and with exact methods incorporated in the commercial CPLEX solver. The obtained experimental results indicate the efficiency of the proposed algorithms in obtaining high-quality solutions for all considered test instances. The presented comparative analysis indicates the advantages of the proposed hybrid algorithms over existing methods in terms of solution quality and/or required computational time, especially for large problem dimensions. The results presented represent a contribution to the fields of discrete optimization, robust optimization and metaheuristic methods. URI: http://hdl.handle.net/123456789/4423 Dateien zu dieser Ressource: 1
Miskovic_Stefan_teza.pdf ( 1.773Mb ) -
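The particle swarm optimization component named in the abstract above works, in its basic continuous form, roughly as follows. This is a minimal textbook PSO on a toy objective, not the dissertation's hybrid (which couples PSO with local search and is adapted to discrete location problems); the bounds, inertia and acceleration coefficients are conventional assumed values.

```python
import random

def pso(cost, dim, n=20, iters=200, seed=1):
    """Minimal particle swarm optimization on the box [-5, 5]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]              # personal best positions
    gbest = min(pbest, key=cost)             # global best position
    w, c1, c2 = 0.7, 1.5, 1.5                # inertia, cognitive, social weights
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if cost(pos[i]) < cost(pbest[i]):
                pbest[i] = pos[i][:]
                if cost(pbest[i]) < cost(gbest):
                    gbest = pbest[i][:]
    return gbest

sphere = lambda x: sum(v * v for v in x)
print(sphere(pso(sphere, dim=3)) < 1e-2)
```

A hybrid of the kind the abstract describes would periodically hand promising particles to a local search routine, letting the swarm explore and the local search exploit.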
Stojadinović, Mirko (Beograd , 2016)[more][less]
Zusammenfassung: Many real-world problems can be modeled as constraint satisfaction problems (CSPs) and then solved by one of many available techniques. One such technique is reduction to SAT, the Boolean satisfiability problem. The variables and constraints of a CSP are translated (encoded) into a SAT instance, which is then solved by a state-of-the-art SAT solver, and the solution, if it exists, is translated back into a solution of the original CSP. The main aim of this thesis is to improve CSP solving techniques that use reduction to SAT. Two new hybrid encodings of CSPs to SAT are presented, combining the good sides of existing encodings. We give a proof of correctness of one encoding that did not exist in the literature. We developed the system meSAT, which enables reduction of CSPs to SAT using 4 basic and 2 hybrid encodings. The system also enables solving CSPs by reduction to two problems related to SAT: SMT and PB. We developed a portfolio for the automated selection of the encoding/solver to be used on a new instance; it is comparable with state-of-the-art portfolios. We developed a hybrid approach based on short solving timeouts, with the aim of significantly reducing the preparation time of a portfolio. Using this approach, we obtained results comparable to those obtained using a preparation time of the usual length. We compared several machine learning techniques to find out which one is best suited for the short training approach. The problem of assigning air traffic controllers to shifts is described and three models of this problem are presented. We used a large number of different solving methods and a diverse set of solvers for this problem, and developed optimization techniques that aim to find optimal solutions. A hybrid technique combining reduction to SAT and local search is shown to be the most efficient.
We also considered sudoku puzzles and the existing techniques for solving puzzles larger than 9×9. Among the techniques used, the existing reduction to SAT is the most efficient for solving these puzzles. We improved the existing algorithm for generating large sudoku puzzles, and show that simple preprocessing rules additionally improve the speed of generating large sudokus. URI: http://hdl.handle.net/123456789/4427 Dateien zu dieser Ressource: 1
MirkoStojadinovicTeza.pdf ( 2.030Mb ) -
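The encoding step described in the abstract above can be illustrated with the simplest classical scheme, the direct encoding: one Boolean variable per (CSP variable, value) pair, with "at least one" and "at most one" clauses per CSP variable. This is a generic textbook sketch, not one of the thesis's encodings, and the brute-force check at the end stands in for a real SAT solver.

```python
from itertools import combinations, product

def direct_encode(domains):
    """Direct encoding: boolean literal lit[(x, v)] means 'CSP variable x
    takes value v'. Returns the literal map and DIMACS-style CNF clauses."""
    lit, nxt = {}, 1
    for x, dom in domains.items():
        for v in dom:
            lit[(x, v)] = nxt
            nxt += 1
    clauses = []
    for x, dom in domains.items():
        clauses.append([lit[(x, v)] for v in dom])         # at least one value
        for v, u in combinations(dom, 2):                  # at most one value
            clauses.append([-lit[(x, v)], -lit[(x, u)]])
    return lit, clauses

def forbid(lit, x, vx, y, vy):
    """Clause ruling out the simultaneous assignment x=vx, y=vy."""
    return [-lit[(x, vx)], -lit[(y, vy)]]

def brute_force_sat(cnf, nvars):
    """Stand-in for a SAT solver: try all assignments."""
    for bits in product([False, True], repeat=nvars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in cnf):
            return bits
    return None

# Tiny CSP: x, y in {1, 2}, constraint x != y.
lit, cnf = direct_encode({"x": [1, 2], "y": [1, 2]})
cnf += [forbid(lit, "x", v, "y", v) for v in [1, 2]]
model = brute_force_sat(cnf, len(lit))
print(model is not None)  # → True: the encoded CSP is satisfiable
```

Decoding goes the other way: the true literal among `lit[("x", v)]` tells which value x takes in the CSP solution.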
Kovačević, Jovana (Beograd , 2015)[more][less]
Zusammenfassung: Proteins represent one of the most important groups of biomolecules. The different functions they carry out in each organism are unique and irreplaceable, including versatile cellular processes, the structural role of proteins, catalytic function, a number of metabolic functions and so on. Knowing and understanding protein function is therefore essential in the investigation of any biological process, especially of human diseases, since many of them are caused by functional mutations. In this work, we investigate the protein function domain through two different approaches. In the first, protein function is represented by GO ontologies with the structure of a directed acyclic graph. There are three GO ontologies: one for functions regarding biological processes, one for functions regarding cellular components and one for molecular functions. Each ontology contains several thousand nodes, where every node denotes a more specific function than its ancestors. The task of this part of the research was to develop software for predicting protein function from its primary sequence, based on the structural support vector machines method, which generalizes the well-known support vector machines method to structured output. The structure-function paradigm is one of the basic concepts in molecular biology, stating that the 3D structure of a protein is closely connected to its role in the organism. It has been observed that disordered proteins (those that lack 3D structure) and disordered regions of proteins are related to severe contemporary illnesses, which has contributed to their popularity in modern research. In the second approach, we investigated the relationship between proteins' functional categories and their disorder, as well as other physico-chemical characteristics of proteins. Here, protein function has been observed through 25 elementary functions grouped into 4 functional groups.
In this work, we present the results of a thorough analysis over a large protein dataset where disorder has been determined computationally, using publicly available tools. URI: http://hdl.handle.net/123456789/4451 Dateien zu dieser Ressource: 1
DoktoratJK2015.pdf ( 1.116Mb ) -
Bačanin Džakula, Nebojša (Beograd , 2015)[more][less]
Zusammenfassung: Hard optimization problems that cannot be solved within acceptable computational time by deterministic mathematical methods have in recent years been successfully solved by population-based stochastic metaheuristics, among which swarm intelligence algorithms represent a prominent class. This thesis investigates improvements of swarm intelligence metaheuristics by hybridization. Analysis of existing swarm intelligence metaheuristics revealed, in some cases, deficiencies and weaknesses in the solution space search mechanisms, primarily as a consequence of the mathematical model that simulates the natural process, as well as an inappropriate balance between intensification and diversification. The thesis examines whether existing swarm intelligence algorithms for global optimization can be improved (in the sense of obtaining better results, faster convergence and better robustness) by hybridization with other algorithms. A number of hybridized swarm intelligence metaheuristics were developed and implemented. Since good hybrids are not created as a random combination of individual functional elements and procedures from different algorithms, but rest on a comprehensive analysis of the functional principles of the algorithms used in the hybridization, the development of the hybrid approaches was preceded by thorough research into the advantages and disadvantages of each involved algorithm, in order to determine the best combination that neutralizes the disadvantages of one approach by incorporating the strengths of the other. The developed hybrid approaches were verified by testing on standard benchmark sets for global optimization, with and without constraints, as well as on well-known practical problems.
Comparative analysis with state-of-the-art algorithms from the literature demonstrated the quality of the developed hybrids and confirmed the hypothesis that swarm intelligence algorithms can be successfully improved by hybridization. URI: http://hdl.handle.net/123456789/4245 Dateien zu dieser Ressource: 1
phdBacaninNebojsa.pdf ( 3.813Mb ) -
Alatrash, Emhimed Salem (Beograd , 2015)[more][less]
Zusammenfassung: Ontologies, often defined as an explicit specification of a conceptualization, are necessary for knowledge representation and knowledge exchange. This means that an ontology describes the concepts and relations that exist in a domain. To enable knowledge exchange, it is necessary to describe these concepts and relations in a better way than just ordering them in a taxonomy. A computational ontology consists of a number of different components, such as concepts, instances, individuals or facts, relations and attributes. The present research considers different software tools related to the Semantic Web and compares them. Five ontology editors are described and compared: Apollo, OntoStudio, Protégé, Swoop and TopBraid Composer Free Edition. The structure and basic features of these editors, as well as the way of using them, are described. The main criterion used in comparing these editors is their convenience for the user and the possibility of applying them in different kinds of applications. The main goal of the work is to introduce a method for constructing an ontology of a certain domain for the Semantic Web. A number of software tools adapted to building domain ontologies for the most widespread natural languages are available; however, accomplishing this for any given natural language presents a challenge. This research proposes a semi-automatic procedure to create ontologies for different natural languages. The approach utilizes various software tools that are available on the Internet, most notably DODDLE-OWL, a domain ontology development tool implemented for the English and Japanese languages. Through this tool, WordNet, Protégé and XSLT transformations, the researcher proposes a general procedure to construct a domain ontology for any natural language. URI: http://hdl.handle.net/123456789/4266 Dateien zu dieser Ressource: 1
phdEmhimedAlatrash.pdf ( 2.171Mb ) -
Nikolić, Mladen (Belgrade , 2013)[more][less]
Zusammenfassung: In this thesis the problem of guiding search in automated theorem proving is considered. The thesis consists of two parts whose common topic is the CDCL search system, the system intensively used by modern SAT solvers. In the first part of the thesis a simple approach to guiding search is considered: guiding by the selection of the solver, its heuristics, and their parameters, based on the properties of the instance to be solved. The basis of the proposed methods for algorithm selection is the syntactic similarity of formulae, which is reflected in their graph structure. This graph similarity is established and analyzed using an original graph similarity measure (which turned out to be useful in other contexts, too). However, practical approaches to measuring the similarity of formulae are based on their numerical features, due to computational complexity issues. Two simple methods for algorithm selection, based on k nearest neighbors, were proposed. The first technique, ArgoSmArT, is based on classifying the instance into one of the predefined families for which efficient algorithms are known; the instance is solved by the algorithm corresponding to the family to which it was classified. The second technique, ArgoSmArT k-NN, is based on finding several similar instances in the training set for which the solving times of all considered algorithms are known; the instance is solved by the algorithm that behaves best on those instances. The ArgoSmArT technique is better suited for configuration selection of a SAT solver, and ArgoSmArT k-NN for SAT solver selection. The ArgoSmArT k-NN technique showed itself to be more efficient than the most important and very complex system for SAT solver selection, the SATzilla system. Apart from CNF SAT solver selection, the problem of non-CNF SAT solver selection is considered.
The focus was not on solver selection techniques, since the proposed techniques are directly applicable, but on the attributes that can be used to describe non-CNF SAT instances, which had not been proposed earlier. The results in this domain are positive, but still limited. The main reason is the lack of a greater number of non-CNF SAT solvers of different behaviour, which is not surprising, given that this kind of solver is in an early stage of development. Apart from the construction of an efficient SAT solver selection system, a methodology for SAT solver comparison based on statistical hypothesis testing is proposed. The need for such a methodology comes from the great run-time variations in single instance solving by a solver, which can result in different SAT solver orderings when one tries to compare their performance or rank them, as experimentally demonstrated. The proposed methodology gives an estimate of the statistical significance of the performed test and an estimate of the effect size, for instance the probability of one solver being faster than another. The second part of the thesis is concerned with generalizing the CDCL search system to fragments of first-order logic. The proposed system can be used as a basis for efficient proving in a fragment if the rules of resolution and factoring are specified for that fragment. These rules are defined for an extension of coherent logic. The soundness and completeness of the system are proved. The system has several distinguishing features which are a consequence of a previously performed analysis of the challenges in coherent logic theorem proving. The system enables first-order reasoning, instead of the ground reasoning characteristic of all existing coherent logic provers. Moreover, it introduces backjumping and lemma learning. Special attention in the system design was paid to the possibility of generating readable proofs by a prover implementing the system.
This possibility is one of the greatest qualities of coherent logic, but it is not easy to achieve if the CDCL search system is used. One of the properties of the system that came from the need to generate readable proofs is the preservation of quantifiers in the proving process, which is rather unusual for existing CDCL systems. Another advantage of the proposed CDCL system is the possibility of transferring heuristics already successfully employed in SAT solving to other domains. Based on the proposed system, the proof procedure Calypso for extended coherent logic was defined, which can also be used in standard coherent logic. An extension of the Rete algorithm is proposed which enables the detection of conflicts and of literals to be propagated or decided. The procedure Calypso is implemented in C++. It was evaluated on representative coherent logic problems and showed itself superior to other coherent logic provers and also to the prover Vampire, the most efficient prover for first-order logic. Based on the results presented in this thesis, it can be concluded that the main hypothesis of this work, that the search system used in CDCL SAT solvers can be significantly improved by simple guiding and that it can be successfully formulated for fragments of first-order logic such as coherent logic, was confirmed, and that concrete answers on how to do that were provided. URI: http://hdl.handle.net/123456789/2584 Dateien zu dieser Ressource: 1
nikolic_mladen.pdf ( 1.448Mb ) -
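The k-nearest-neighbors solver selection idea described above (find the most similar training instances by their numerical features, then pick the algorithm that performed best on them) can be sketched as follows. The feature vectors and runtimes are hypothetical, and the average-runtime rule is one plausible choice of "behaves best", not necessarily the exact criterion used by ArgoSmArT k-NN.

```python
import math

def knn_select(train, instance_features, k=3):
    """Pick a solver for a new instance: find the k most similar training
    instances (Euclidean distance over feature vectors) and choose the
    solver with the lowest average runtime on them."""
    nearest = sorted(train,
                     key=lambda t: math.dist(t["features"], instance_features))[:k]
    solvers = nearest[0]["runtimes"].keys()
    return min(solvers, key=lambda s: sum(t["runtimes"][s] for t in nearest) / k)

# Hypothetical training data: per-instance feature vectors and
# per-solver runtimes in seconds.
train = [
    {"features": [0.1, 5.0], "runtimes": {"A": 1.0, "B": 9.0}},
    {"features": [0.2, 4.8], "runtimes": {"A": 2.0, "B": 8.0}},
    {"features": [0.9, 0.5], "runtimes": {"A": 7.0, "B": 1.5}},
    {"features": [1.0, 0.4], "runtimes": {"A": 6.0, "B": 1.0}},
]
print(knn_select(train, [0.15, 4.9], k=2))  # → A
```

Instances near the first cluster get solver A, instances near the second get solver B; a real portfolio does the same over dozens of features and solvers.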
Dimovski, Igor (Novi Sad , 2011)[more][less]
Zusammenfassung: This PhD dissertation presents comprehensive pedagogical research on teaching mathematics at the tertiary, university level. Educational resources tailored in electronic form using the programme package Matlab are integrated into the learning process. The impact of ICT use on essential knowledge of multivariate calculus (functions of several variables, vector-valued functions and three-dimensional analytical geometry) has been statistically explored through the intensive use of 3D static and dynamic visual tools. Some of the students who participated in the research developed Matlab programmes on their own, and one part of the research focused on the probable impact of programming skills on learning mathematical concepts. URI: http://hdl.handle.net/123456789/3874 Dateien zu dieser Ressource: 1
PhD_I_Dimovski.pdf ( 5.423Mb )