Computer Science

Browse by

DIRECT DATA-SNAPSHOTTING AND SNAPSHOT SHARING ACROSS CLOUD-NATIVE APPLICATIONS

Ristović, Ivan (Beograd , 2026)

[more]

Abstract:

Cloud-computing platforms provide services to consumers through multiple serviceoffering models. Recent advances in these models have led to the emergence of serverless computing, or simply serverless, where infrastructure is managed by the service provider. Serverless is usually coupled with function-based programming model in which software systems are composed of reusable, lightweight units of code executed within isolated sandboxed environments. Major cloud-computing platforms, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud, report that a substantial proportion of their customers employ serverless solutions. Most cloud-computing providers employ a pay-as-you-go billing model. Inefficient utilization of computing resources, particularly CPU time and working memory, which constitute the most costly resources, leads to increased overall operational costs. Moreover, the requirement for resource isolation adversely affects initialization latency and results in additional CPU and working-memory overhead. Serverless sandboxes are typically deployed on top of heavyweight virtualization stacks that includeJava, JavaScript, or Python runtime environments with accompanying frameworks, further increasing working-memory consumption. Modern cloud-computing architectures use Checkpoint/Restore (abbr. c/r) techniques to freeze initialized sandboxes into a continuable form. Such techniques, in combination with cloud-native deployments, allow the virtualized environment to optimize resource consumption and share code and pre-initialized data across multiple sandboxes. However, such solutions either operate at application-build time to support data pre-initialization or sharing, or operate at execution time with limited sharing potential for data available during application execution. Such data is processed multiple times and duplicated in each sandbox. This dissertation presents Doss, a direct object snapshotting and sharing system that performs data c/r during application execution. Doss persists data directly, without transformations, into reusable and shareable snapshots. Direct snapshotting allows Doss to achieve near-constant data deserialization time, greatly improving initialization times and reducing CPU usage. Doss architecture enables snapshot sharing across application instances, eliminating the excess memory footprint associated with data re-processing and duplication. GraalDoss, a Doss implementation for Java, is integrated into the GraalVM ecosystem. GraalDoss is evaluated using 106 correctness and robustness tests and a novel set of cloudnative micro and macro benchmarks that exercise real-world scenarios. A comprehensive evaluation of GraalDoss shows a consistent near-constant data-deserialization overhead with serialization times comparable to state-of-the-art Java JSON and binary serialization libraries. GraalDoss reduces the memory footprint of web API microservice caches by sharing populated cache snapshots across microservice instances, improving the overall density by 41% for 8 microservice instances and improving first-response times by 34%. In NLP applications, GraalDoss improves the pipeline execution times by six orders of magnitude by snapshotting pipeline results and subsequently loading the snapshots.

URI:

http://hdl.handle.net/123456789/5786

Files in this item: 1

IvanRistovic_PhD_Dissertation.pdf ( 4.683Mb )

Metode za efikasno rešavanje dominacijskih problema na velikim grafovima

Kapunac, Stefan (Beograd , 2026)

[more]

Abstract:

This dissertation addresses methods for efficiently solving several important variants of domination problems on graphs, with a particular focus on large-scale instances that frequ- ently appear in real-world systems. Domination problems have numerous applications in the analysis and management of complex networks, including social, telecommunication, transport, and biological networks. The study covers four problems: minimum weight total domination, minimum weight independent domination, k-strong Roman domination, and the canonical mi- nimum domination problem on large graphs. For the minimum weight total domination problem, a variable neighborhood search approach is proposed, with carefully designed mechanisms for shaking, local search, and fitness function evaluation. The results show that the proposed algorithm achieves optimal solutions on small and medium instances and outperforms competing approaches on large graphs. Additionally, an application of this problem for accelerating information spreading in social networks is proposed. For the minimum weight independent domination problem, two new integer linear pro- gramming models are developed. Solving these models finds optimal solutions for all smaller instances while demonstrating superior performance compared to competing exact approaches on larger graphs. In addition, a greedy heuristic is proposed that outperforms competing greedy methods on most instances. In the case of k-strong Roman domination, a greedy heuristic based on node coverage information is developed, along with a metaheuristic approach based on variable neighborhood search that uses the greedy algorithm for initialization. This problem is particularly challenging due to the exponential complexity of solution feasibility verification, leading to the introduction of the concept of quasi-feasibility that enables efficient feasibility assessment during the search. Experimental results show that the proposed algorithm consistently outperforms the greedy approach and existing competing methods, especially on larger graphs. The practical value of the algorithm is illustrated through a case study involving the optimal positioning of fire stations and vehicles in urban municipalities to ensure the entire city is safe in the event of k simultaneous fires. For the minimum domination problem, a new hybrid approach called IRIS is proposed. IRIS is designed as a general-purpose framework that bridges the gap between exact integer linear programming solvers and heuristic search by iteratively fixing selected variables to reduce the search space. Тhe novelty lies in its flexible subproblem construction mechanism, which can be tailored using various selection strategies. In this study, we implement and evaluate a specific configuration of IRIS that utilizes historical statistical data and a node-coverage-based heuristic to intelligently identify variables for fixing. This targeted approach allows the ILP solver to find high-quality solutions for large-scale instances that are computationally prohibitive for exact methods. Experimental results demonstrate that IRIS achieves competitive performance com- pared to the best existing methods, establishing it as a valid alternative for solving domination and potentially other NP-hard problems.

URI:

http://hdl.handle.net/123456789/5782

Files in this item: 1

phdStefanKapunac.pdf ( 3.299Mb )

PREDVIĐANJE PROFILA IZVRŠAVANJA PROGRAMA TEHNIKAMA MAŠINSKOG U ČENJA

Čugurović, Milan (Beograd , 2025)

[more]

Abstract:

Compilers use program profiles to perform profile-guided optimizations and pro- duce efficient programs. Although dynamic profilers generate high-quality profiles, they have significant drawbacks. They complicate the application build pipeline by requiring two compi- lation steps and an additional profile collection run. Dynamic profilers also consume substantial time and memory and place a heavy burden on developers to create suitable workloads that accurately reflect typical application usage, cover important code paths, and generate well- distributed profiles. In response to the shortcomings of dynamic profilers, modern static profilers employ ma- chine learning (ML) techniques to predict program profiles. However, state-of-the-art ML-based static profilers often rely on handcrafted features that are platform-specific and difficult to adapt across different architectures and programming languages. They also tend to use computation- ally intensive deep neural networks, which increase application compilation time. Moreover, ML-based static profilers can degrade the performance of optimized programs due to inaccurate profile predictions. This dissertation presents GraalSP , an ML-based static profiler that is portable, polyglot, efficient, and robust. GraalSP achieves portability by defining features on a high-level, graph- based intermediate representation and by partially automating the feature extraction process. This design makes GraalSP polyglot, allowing it to predict profiles for programs written in any language that compiles to Java bytecode, such as Java, Scala, or Kotlin. GraalSP is efficient due to its use of a lightweight XGBoost model based on decision trees, and robust because it relies on carefully designed heuristics that correct machine learning predictions and ensure high performance in programs optimized using the predicted profiles. We integrate GraalSP into the Enterprise GraalVM Native Image compiler and evaluate it on 28 benchmarks from the Renaissance, DaCapo, and DaCapo Scala benchmark suites. These suites represent a modern and diverse collection of benchmarks, featuring numerous real-world workloads across a variety of programming paradigms. Our comprehensive evaluation shows that GraalSP achieves a geometric mean speedup of 7.46% in execution time compared to the default compiler configuration, which models program profiles using a uniform distribution. This dissertation also presents a detailed qualitative and quantitative analysis to position and compare the proposed solution against state-of-the-art static profilers. Additionally, to enhance and expand the evaluation and support developers in analyzing GraalSP ’s predictions, this dissertation introduces the GraalSP-PLog tool. This tool allows developers to run the GraalSP static profiler on any program and generate detailed prediction reports, making it easier to inspect individual predictions and identify model mispredictions. Since GraalSP provides substantial performance gains, has minimal impact on binary size and compile time, and includes a modern, fully automated model retraining pipeline, it is well- suited for commercial deployment. As a result, GraalSP has been the default static profiler for the Enterprise GraalVM Native Image compiler since June 2023, consistently improving performance with every build.

URI:

http://hdl.handle.net/123456789/5778

Files in this item: 1

Milan_Cugurovic_doktorska_disertacija.pdf ( 5.694Mb )

Modelovanje moralnih i emocionalnih aspekata jezika u klasifikaciji konverzacionih tekstova

Šošić, Milena (Beograd , 2025)

[more]

Abstract:

Conversational text messages represent an important form of digital communication in modern society. With the development of information technologies, various communication tools have emerged, such as email, social media, instant messaging tools, and automated response systems. Messages generated within these tools, unlike standard texts, have a specific structure that allows for the classification of individual messages or sets of messages that form a conversation. Classification labels are defined by the specific task being addressed and can be either single-label or multi-label, which enables the recognition of complex interrelationships between the categories. Introducing moral and emotional dimensions of language into research is crucial for understanding the complex patterns of human communication, particularly in the context of digital platforms and social media. Machine learning (ML) methods, such as deep neural networks (DNN), facilitate the utilization and more precise recognition of these aspects while simultaneously providing an efficient way to classify emotions and moral values expressed in texts. The noticeable complexity in the expression of human emotions and moral values, which are often conveyed implicitly and depend heavily on context, makes their recognition particularly challenging. One of the major challenges is the lack of or limited availability of resources in terms of size and diversity for low-resource languages, including Serbian. The development of linguistic resources, such as annotated lexicons and corpora, plays a crucial role in this process by providing the necessary knowledge sources for building and improving existing ML models. Linguistic resources enable models to learn how different emotional expressions and moral values influence the tone and meaning of communication. To support this, a semantic lexicon for sentiment intensity, SentiWords.SR, containing approximately 15k words, was developed for the Serbian language, along with the associated tool SRPOL for measuring sentiment intensity in textual sequences in Serbian. Additionally, a semantic lexicon for emotional affect, EmoLex.SR, comprising around 9.8k words with assigned emotional intensity values, and a semantic lexicon for moral values, MFD.SR, consisting of approximately 4.3k words with associated moral value weights, were developed. Significant efforts were also made in annotating the first conversational corpora from social media with emotional and moral categories. In this regard, the Social-Emo.SR corpus (∼34.6k messages) was developed, consisting of the Twitter-Emo.SR subcorpus (∼16.7k messages) and the Reddit-Emo.SR subcorpus (∼17.9k messages), collected from Twitter and Reddit, respectively. Furthermore, by searching for key moral-related terms, a subset of messages expressing potential moral stances was extracted from Social-Emo.SR. This subset, named Social-Mor.SR (∼13.6k messages), was manually verified and annotated by human annotators and consists of the Twitter-Mor.SR subcorpus (∼6.1k Twitter messages) and the Reddit-Mor.SR subcorpus (∼7.5k Reddit messages). In the context of DNN architectures, models based on recurrent networks or transformers, trained on these resources, enable the recognition and utilization of emotional and moral aspects of language in various contexts. The combination of advanced algorithms, such as Bidirectional Long Short-Term Memory (BiLSTM) networks and the attention mechanism with linguistically and culturally adapted resources (Meta) opens new possibilities for analyzing moral and emotional aspects of language. This has broad applications in classification tasks such as recognizing personal context, truthfulness of posts, or types of engagement in digital communication. For personal context recognition, i.e. classifying corporate emails as either business-related or personal, results show that using a carefully designed hybrid approach (BiLSTM-Att+Meta) across entire conversation branches yields the best results, comparable to published benchmarks on the same task. In experiments related to rumor veracity classification and identifying engagement types in response to rumors, it was demonstrated that moral and emotional attributes derived from semantic lexicons (EmoAttr, MorAttr ⊆ Meta) improve classification accuracy by +4.2% and +3.8% respectively, compared to methods without these attributes. For emotion recognition in Serbian conversational texts, experiments revealed that transformer-based models fine-tuned on the task achieved F1-scores of approximately 53%, reaching performance levels reported for multi-label classification on the same emotional category set. Additionally, experiments showed that further data preprocessing and balancing improved model performance. In moral value and moral sentiment classification tasks, using the Social-Mor.SR corpus and its subcorpora, an F1-score of ∼46% was achieved for moral value recognition and ∼38% for moral sentiment recognition, indicating acceptable results but also the need for further model optimization. Fine-tuning LLaMA models yielded reasonable but slightly lower performance compared to BERT-based architectures. Since model performance is directly dependent on the data they are trained on, there is potential for further improvements by refining and balancing initial annotations in the utilized corpora.

URI:

http://hdl.handle.net/123456789/5774

Files in this item: 1

Doktorski_rad_Milena_Sosic.pdf ( 6.206Mb )

Computer Science

eLibrary

Computer Science

Browse by

Recent Submissions

Search eLibrary

Browse

All of eLibrary

This Collection

My Account

Relited sites

COPYRIGHT STATEMENT

Abstract:	The dissertation relates to the elements of medical decision-making, modeled by a consultative expert system, characteristic to the domain of rheumatology and potentially other domains of medicine with a similar structure.
URI:	http://hdl.handle.net/123456789/5764