Search Results
In modern computing architectures, graph theory takes center stage as core counts keep rising, making it essential to find better ways to connect the cores. This paper revisits Equality, a novel chordal-ring interconnect topology, and compares it with several previous works. The paper details the procedures for constructing the Equality interconnect, its special routing procedures, the strategies for selecting a configuration, and the evaluation of its performance using the open-source cycle-accurate BookSim package. Four scenarios representing small- to large-scale computing facilities are presented to assess the network performance. This work shows that in 16,384-endpoint systems, the Equality network turns out to be the most efficient system. The results also s...
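The Equality construction itself is detailed in the paper; as a generic point of reference, a chordal ring of n nodes links each node i to its ring neighbors i ± 1 and to chord neighbors i ± w for fixed offsets w. A minimal Python sketch of that generic structure follows (the node count and chord offsets are illustrative placeholders, not the Equality configuration-selection strategy):

```python
# Illustrative chordal-ring adjacency builder (generic, not the Equality topology).
def chordal_ring(n, chord_offsets):
    """Return an adjacency list for an n-node ring augmented with chords.

    Each node i is linked to i-1 and i+1 (mod n) plus i +/- w for every
    chord offset w. Offsets here are placeholders for illustration only.
    """
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for w in [1] + list(chord_offsets):
            adj[i].add((i + w) % n)
            adj[i].add((i - w) % n)
    return adj

# Example: a 16-node chordal ring with chords of length 3 and 5.
topology = chordal_ring(16, chord_offsets=(3, 5))
print(sorted(topology[0]))   # neighbors of node 0
```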
Guiding the decisions of stock market investors is among the most challenging problems in financial research. Markowitz’s approach to portfolio selection models stock profitability and risk level through a mean–variance model, which involves estimating a very large number of parameters. In addition to requiring considerable computational effort, this raises serious concerns about the reliability of the model in real-world scenarios. This paper presents a hybrid approach that combines itemset extraction with portfolio selection. We propose to adapt Markowitz’s model logic to deal with sets of candidate portfolios rather than with single stocks. We overcome some of the known issues of the Markowitz model as follows: (i) Complexity: we reduce the model complexity, in terms of parameter es...
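For context, a plain mean–variance setup (not the paper's itemset-based hybrid) estimates a mean return vector and a covariance matrix from historical returns and scores a portfolio by expected return minus a risk penalty; the O(n²) covariance entries are the parameter explosion the abstract alludes to. A hedged numpy sketch on synthetic data:

```python
# Minimal Markowitz-style mean-variance scoring on synthetic data.
# Generic illustration only, not the paper's itemset-based method.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(250, 5))   # 250 days x 5 assets (synthetic)

mu = returns.mean(axis=0)               # expected returns (n parameters)
sigma = np.cov(returns, rowvar=False)   # covariance matrix (n*(n+1)/2 parameters)

def mean_variance_score(w, risk_aversion=5.0):
    """Higher is better: expected portfolio return penalized by its variance."""
    return mu @ w - risk_aversion * (w @ sigma @ w)

w_equal = np.full(5, 1 / 5)             # naive equally weighted portfolio
print(mean_variance_score(w_equal))
```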
The determination of Lagrangian Coherent Structures (LCS) is becoming very important in several disciplines, including cardiovascular engineering, aerodynamics, and geophysical fluid dynamics. From the computational point of view, the extraction of LCS consists of two main steps: the flowmap computation and the resolution of Finite Time Lyapunov Exponents (FTLE). In this work, we focus on the design, implementation, and parallelization of the FTLE resolution. We offer an in-depth analysis of this procedure, as well as an open source C implementation (UVaFTLE) parallelized using OpenMP directives to attain a fair parallel efficiency in shared-memory environments. We have also implemented CUDA kernels that allow UVaFTLE to leverage as many NVIDIA GPU devices as desired in order to rea...
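The second step follows the standard FTLE definition: from the flow map gradient, form the right Cauchy–Green tensor C = (∇Φ)ᵀ∇Φ and take FTLE = (1/|T|) ln √λ_max(C). A small numpy sketch on a discretized 2D flow map (the grid and flow map below are synthetic placeholders, not UVaFTLE code):

```python
# FTLE from a discretized 2D flow map via the Cauchy-Green tensor.
# Synthetic flow map; mirrors the standard definition, not UVaFTLE itself.
import numpy as np

def ftle_field(phi_x, phi_y, dx, dy, horizon):
    """phi_x, phi_y: final particle positions on a regular (x, y) grid."""
    dphixdx, dphixdy = np.gradient(phi_x, dx, dy)
    dphiydx, dphiydy = np.gradient(phi_y, dx, dy)
    ftle = np.empty_like(phi_x)
    for idx in np.ndindex(phi_x.shape):
        grad = np.array([[dphixdx[idx], dphixdy[idx]],
                         [dphiydx[idx], dphiydy[idx]]])
        cauchy_green = grad.T @ grad
        lmax = np.linalg.eigvalsh(cauchy_green)[-1]   # largest eigenvalue
        ftle[idx] = np.log(np.sqrt(lmax)) / abs(horizon)
    return ftle

# Tiny synthetic example: a shear flow advected over a unit time horizon.
x, y = np.meshgrid(np.linspace(0, 1, 32), np.linspace(0, 1, 32), indexing="ij")
print(ftle_field(x + 0.5 * y, y, dx=1/31, dy=1/31, horizon=1.0).max())
```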
The management of shared resources in multicore processors is an open problem due to the continuous evolution of these systems. The trend toward increasing the number of cores and organizing them in clusters sets out new challenges not considered in previous works. In this paper, we characterize the use of the shared cache and memory bandwidth of an AMD Rome processor executing multiprogrammed workloads and propose Balancer, a set of mechanisms that control the use of these resources to improve system performance and fairness. Our control mechanisms require no hardware or operating system modifications. We evaluate Balancer on a real system running SPEC CPU2006 and CPU2017 applications. Balancer tuned for performance shows an average increase of 7.1% in system performance and an unfairness...
Prediction and classification of diseases are essential in medical science, as early prediction helps curb the spread of a disease and identify affected regions at an early stage. Machine learning (ML) approaches are commonly used to predict and classify diseases, serving as an efficient tool for doctors and specialists. This paper proposes a prediction framework based on ML approaches to predict Hepatitis C Virus among healthcare workers in Egypt. We utilized real-world data from the National Liver Institute at Menoufiya University (Menoufiya, Egypt). The collected dataset consists of 859 patients with 12 different features. To ensure the robustness and reliability of the proposed framework, we performed two scenarios: the first without featur...
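As a rough illustration of the two-scenario comparison (with and without feature selection), here is a hedged scikit-learn sketch on synthetic data; the dataset, selector, and model below are placeholders, not the National Liver Institute data or the paper's exact pipeline:

```python
# Hedged sketch: classification with and without feature selection.
# Synthetic stand-in data shaped like the described dataset (859 samples, 12 features).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=859, n_features=12, n_informative=6, random_state=0)

baseline = RandomForestClassifier(random_state=0)
with_selection = make_pipeline(SelectKBest(f_classif, k=6),
                               RandomForestClassifier(random_state=0))

print("no selection  :", cross_val_score(baseline, X, y, cv=5).mean())
print("with selection:", cross_val_score(with_selection, X, y, cv=5).mean())
```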
We introduce the Reverse Spatial Top-k Keyword (RSK) query, which is defined as: given a query term q, an integer k, and a neighborhood size, find all the neighborhoods of that size where q is among the top-k most frequent terms in the social posts of those neighborhoods. An obvious approach would be to partition the dataset with a uniform grid structure of a given cell size and identify the cells where this term is in the top-k most frequent keywords. However, this answer would be incomplete, since it only checks neighborhoods that are perfectly aligned with the grid. Furthermore, for every neighborhood (square) that is an answer, we can define infinitely many more result neighborhoods by minimally shifting the square without including more posts in it. To address that, we need to i...
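The baseline grid approach mentioned above can be sketched directly: bucket posts into cells of the given size and report the cells where q ranks among the k most frequent terms. A small Python sketch (the data and function name are illustrative; it checks only grid-aligned neighborhoods, which is exactly the incompleteness the query definition points out):

```python
# Baseline (incomplete) grid check for the RSK idea: grid-aligned cells only.
from collections import Counter, defaultdict

def grid_aligned_rsk(posts, q, k, cell_size):
    """posts: iterable of (x, y, terms). Returns cells where q is top-k frequent."""
    cells = defaultdict(Counter)
    for x, y, terms in posts:
        cell = (int(x // cell_size), int(y // cell_size))
        cells[cell].update(terms)
    result = []
    for cell, counts in cells.items():
        top_k = {term for term, _ in counts.most_common(k)}
        if q in top_k:
            result.append(cell)
    return result

posts = [(0.2, 0.4, ["cafe", "music"]), (0.7, 0.1, ["cafe"]), (3.5, 2.2, ["news"])]
print(grid_aligned_rsk(posts, q="cafe", k=1, cell_size=1.0))
```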
Cryptographic constructions based on hard lattice problems have emerged as a front-runner for the standardization of post-quantum public-key cryptography. As the standardization process takes place, optimizing specific parts of proposed schemes, e.g., Bernoulli sampling and integer Gaussian sampling, becomes a worthwhile endeavor. In this work, we propose a novel Bernoulli sampler based on polar codes, dubbed “polar sampler”. The polar sampler is information-theoretically optimal in the sense that the number of uniformly random bits it consumes approaches the entropy bound asymptotically.
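The entropy bound referenced here is the binary entropy of the target bias: generating one Bernoulli(p) output from uniform random bits cannot, on average, consume fewer bits than H(p). Stated informally:

```latex
% Binary entropy bound for sampling Bernoulli(p) from uniform random bits:
% the expected number of consumed bits per sample is lower-bounded by H(p).
H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p), \qquad
\mathbb{E}[\text{bits per sample}] \ge H(p).
```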
Iterative stencil computations are widely used in numerical simulations. They present a high degree of parallelism, high locality, and mostly coalesced memory access patterns. Therefore, GPUs are good candidates to speed up their computation. However, the development of stencil programs that can work with huge grids in distributed systems with multiple GPUs is not straightforward, since it requires solving problems related to the partition of the grid across nodes and devices, and the synchronization and data movement across remote GPUs. In this work, we present EPSILOD, a high-productivity parallel programming skeleton for iterative stencil computations on distributed multi-GPU systems of the same or different vendors, which supports any type of n-dimensional geometric stencil of any order...
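For readers unfamiliar with the pattern, an iterative stencil repeatedly updates each grid point from a fixed neighborhood of its previous values. A minimal single-node numpy sketch of a 2D 5-point Jacobi stencil follows (it illustrates the computation pattern only, not EPSILOD's distributed multi-GPU skeleton):

```python
# Minimal 2D 5-point Jacobi stencil iteration (single node, numpy).
# Partitioning, halo exchange, and multi-GPU execution are omitted here.
import numpy as np

def jacobi_step(grid):
    """One stencil sweep: each interior point becomes the mean of its 4 neighbors."""
    new = grid.copy()
    new[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                              grid[1:-1, :-2] + grid[1:-1, 2:])
    return new

grid = np.zeros((64, 64))
grid[0, :] = 100.0            # fixed boundary condition on one edge
for _ in range(200):          # iterate the stencil
    grid = jacobi_step(grid)
print(grid[1, 32])            # value near the hot boundary after 200 sweeps
```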
Motion Estimation is one of the main tasks behind any video encoder. It is a computationally costly task; therefore, it is usually delegated to specific or reconfigurable hardware, such as FPGAs. Over the years, multiple FPGA implementations have been developed, mainly using hardware description languages such as Verilog or VHDL. Since programming using hardware description languages is a complex task, it is desirable to use higher-level languages to develop FPGA applications. The aim of this work is to evaluate OpenCL, in terms of expressiveness, as a tool for developing this kind of FPGA application. To do so, we present and evaluate a parallel implementation of the Block Matching Motion Estimation process using OpenCL for Intel FPGAs, usable and tested on an Intel Stratix 10 FPGA...
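Block Matching Motion Estimation compares each block of the current frame against candidate blocks in a search window of the reference frame, typically scoring candidates by the sum of absolute differences (SAD). A plain Python/numpy sketch of an exhaustive search for one block (illustrative only, not the paper's OpenCL kernel):

```python
# Exhaustive block matching for a single block using SAD (illustrative only).
import numpy as np

def best_motion_vector(ref, cur, bx, by, block=8, search=4):
    """Find the (dy, dx) in a +/-search window minimizing SAD for the block at (by, bx)."""
    target = cur[by:by + block, bx:bx + block].astype(np.int32)
    best, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            sad = np.abs(ref[y:y + block, x:x + block].astype(np.int32) - target).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv, best

rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, -1), axis=(0, 1))   # shifted content: best match at (-2, 1)
print(best_motion_vector(ref, cur, bx=16, by=16))
```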
The perception of realism in computer-generated images can be significantly enhanced by subtle visual cues. Among those, one can highlight the presence of dust on synthetic objects, which is often subject to temporal variations in real settings. In this paper, we present a framework for the generation of textures representing the accumulation of this ubiquitous material over time in indoor settings. It employs a physically inspired approach to portray the effects of different levels of accumulated dust roughness on the appearance of substrate surfaces and to modulate these effects according to the different illumination and viewing geometries. The development of its core algorithms was guided by empirical insights and data obtained from observational experiments which are also descr... |