LNCS Proceedings Papers
Petr Bauch, Vojtěch Havel, and Jiří Barnat,
LTL Model Checking of LLVM Bitcode with Symbolic Data
The correctness of parallel and reactive programs is often more easily specified
using formulae of temporal logics. Yet verifying that a system satisfies such
specifications is more difficult than verifying safety properties: the
recurrence of a specific program state has to be detected. This paper reports
on the development of a generic framework for automatic verification of linear
temporal logic specifications for programs in LLVM bitcode. Our method
searches explicitly through all possible interleavings of parallel threads
(control non-determinism) but represents symbolically the variable evaluations
(data non-determinism), guided by the specification in order to prove
correctness. To evaluate the framework, we compare our method with
state-of-the-art tools on a set of unmodified C programs.
Stephan Beyer and Markus Chimani,
Steiner Tree 1.39-Approximation in Practice
We consider the currently strongest Steiner tree approximation algorithm,
which has recently been published by
Goemans, Olver, Rothvoß and Zenklusen (2012).
It first solves a hypergraphic LP relaxation
and then applies matroid theory to obtain an integral solution.
The cost of the resulting Steiner tree is at most
(1.39 + ε)-times
the cost of an optimal Steiner tree
where ε tends to zero
as some parameter k tends to infinity.
The degree of the polynomial running time depends on this constant k,
so only small k are tractable in practice.
The algorithm has, to our knowledge, not been implemented
and evaluated in practice before.
We investigate different implementation aspects and parameter
choices of the algorithm
and compare tuned variants to an exact LP-based algorithm
as well as to fast and simple 2-approximations.
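The "fast and simple 2-approximations" used as a baseline are, in the classic construction, minimum spanning trees over the metric closure of the terminals. The following is a minimal sketch of that generic construction (our own illustration with invented names, not the authors' tuned implementation):

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path distances from src; adj is a dict of dicts of edge weights."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def steiner_2approx_cost(adj, terminals):
    """Cost of a minimum spanning tree over the metric closure of the
    terminals; this classic construction costs at most twice the optimum."""
    terminals = list(terminals)
    dists = {t: dijkstra(adj, t) for t in terminals}
    # Prim's algorithm on the complete graph over the terminals.
    in_tree = {terminals[0]}
    cost = 0
    while len(in_tree) < len(terminals):
        w, v = min((dists[u][t], t) for u in in_tree
                   for t in terminals if t not in in_tree)
        cost += w
        in_tree.add(v)
    return cost
```

For a star with centre c and three unit edges to terminals a, b, d, the optimum Steiner tree costs 3 while the metric-closure MST costs 4, within the factor-2 guarantee.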
Jan Fiedor, Zdeněk Letko, João Lourenço and Tomáš Vojnar,
On Monitoring C/C++ Transactional Memory Programs
Transactional memory (TM) is an increasingly popular technique for synchronising
threads in multi-threaded programs. To address both correctness and
performance-related issues of TM programs, one needs to monitor and analyse
their execution. However, monitoring concurrent programs (including TM programs)
can have a non-negligible impact on their behaviour, which may hamper objectives
of the intended analysis. In this paper, we propose several approaches for
monitoring TM programs and study their impact on the behaviour of the monitored
programs. The considered approaches range from specialised lightweight
monitoring to generic heavyweight monitoring. The implemented monitoring tools
are publicly available for further applications, and the implementation
techniques used for lightweight monitoring can be used as an inspiration for
developing further specialised lightweight monitors.
Bent Functions Synthesis on Intel Xeon Phi Coprocessor
A new approach to synthesize bent Boolean functions by means of Cartesian Genetic
Programming (CGP) has been proposed recently.
Bent functions have important applications in cryptography due to their high
nonlinearity. However, they are very rare and their discovery using conventional
brute-force methods is not efficient enough.
In this paper, a new parallel implementation is proposed and the performance
is evaluated on the Intel Xeon Phi Coprocessor.
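For illustration, bentness can be checked with a fast Walsh-Hadamard transform: a Boolean function on n (even) variables is bent iff every Walsh coefficient has absolute value 2^(n/2). A small self-contained sketch of that check (only the fitness-style test, not the CGP synthesis itself):

```python
def walsh_spectrum(f_truth):
    """Walsh-Hadamard transform of a Boolean function given as a truth table."""
    w = [(-1) ** b for b in f_truth]   # signed representation (-1)^f(x)
    h = 1
    while h < len(w):                  # in-place fast WHT
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                a, b = w[j], w[j + h]
                w[j], w[j + h] = a + b, a - b
        h *= 2
    return w

def is_bent(f_truth, n):
    """Bent iff every Walsh coefficient has absolute value 2^(n/2)."""
    flat = 2 ** (n // 2)
    return all(abs(c) == flat for c in walsh_spectrum(f_truth))

# f(x) = x0*x1 XOR x2*x3, a classic bent function on 4 variables
n = 4
truth = [(x & 1) * ((x >> 1) & 1) ^ ((x >> 2) & 1) * ((x >> 3) & 1)
         for x in range(2 ** n)]
```

The rarity mentioned above is why an evolutionary search pays off: for larger n only a vanishing fraction of truth tables pass this test.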
Vojtěch Nikl and Jiří Jaroš,
Parallelisation of the 3D Fast Fourier Transform Using the Hybrid OpenMP/MPI Decomposition
The 3D fast Fourier transform (FFT) is the heart of many simulation methods.
Although the efficient parallelisation of the FFT has been studied in depth over
the last few decades, many researchers have focused only on either pure message-passing
(MPI) or shared-memory (OpenMP) implementations. Unfortunately, pure MPI approaches
cannot exploit the shared memory within a cluster node, while pure OpenMP cannot
scale over multiple nodes.
This paper proposes a 2D hybrid decomposition of the 3D FFT in which the domain
is decomposed over the first axis by means of MPI and over the second axis
by means of OpenMP. The performance of the proposed method is thoroughly compared
with the state of the art libraries (FFTW, PFFT, P3DFFT) on three supercomputer
systems with up to 16k cores. The experimental results show that the hybrid
implementation offers 10-20 % higher performance and better scaling especially
for high core counts.
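The ownership pattern of such a 2D decomposition can be sketched as follows (a toy illustration of ours with an assumed block distribution, not the paper's code): each pencil along the third axis is assigned to an (MPI rank, OpenMP thread) pair, the first axis split across ranks and the second across threads.

```python
def pencil_owner(ny, nz, mpi_ranks, omp_threads):
    """Map each (y, z) pencil of an nx*ny*nz domain to a worker:
    the first distributed axis is split across MPI ranks, the second
    across OpenMP threads, using a simple block distribution."""
    owners = {}
    for y in range(ny):
        for z in range(nz):
            rank = y * mpi_ranks // ny       # block distribution over y
            thread = z * omp_threads // nz   # block distribution over z
            owners[(y, z)] = (rank, thread)
    return owners
```

Every worker then owns a contiguous brick of pencils, so the 1D FFTs along the undistributed axis need no communication at all.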
Juraj Nižnan, Radek Pelánek, and Jiří Řihák,
Mapping Problems to Skills Combining Expert Opinion and Student Data
Construction of a mapping between educational content and skills is an
important part of development of adaptive educational systems. This task is
difficult, requires a domain expert, and any mistakes in the mapping may
hinder the potential of an educational system. In this work we study
techniques for improving a problem-skill mapping constructed by a domain
expert using student data, particularly problem solving times. We describe
and compare different techniques for the task: a multidimensional model of
problem solving times and supervised classification techniques. In the
evaluation we focus on surveying situations where the combination of expert
opinion with student data is most useful.
Karel Štěpka and Martin Falk,
Image Analysis of Gene Locus Positions within Chromosome Territories in Human Lymphocytes
One of the important areas of current cellular research with substantial impacts
on medicine is analyzing the spatial organization of genetic material within
the cell nuclei. Higher-order chromatin structure has been shown to play essential
roles in regulating fundamental cellular processes, like DNA transcription,
replication, and repair. In this paper, we present an image analysis method for
the localization of gene loci with regard to chromosomal territories they occupy
in 3D confocal microscopy images. We show that the segmentation of the territories
to obtain a precise position of the gene relative to a hard territory boundary
may lead to undesirable bias in the results; instead, we propose an approach
based on the evaluation of the relative chromatin density at the site of the
gene loci. This method yields softer, fuzzier ``boundaries'', characterized
by progressively decreasing chromatin density. The method therefore focuses
on the extent to which the signals are located inside the territories, rather
than a hard yes/no classification.
Vladimír Štill, Petr Ročkai, and Jiří Barnat,
Context-Switch-Directed Verification in DIVINE
In model checking of real-life C and C++ programs, both search efficiency
and counterexample readability are very important.
In this paper, we suggest context-switch-directed
exploration as a way to find a well-readable counterexample faster.
Furthermore, we allow the number of context switches used in
state-space exploration to be limited if desired. The new algorithm is implemented in the DIVINE
model checker and enables both unbounded and bounded context-switch-directed
exploration for models given in LLVM bitcode, which allows
for efficient verification of multi-threaded C and C++ programs.
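The search order can be illustrated with a generic best-first exploration whose priority is the number of context switches on the path discovered so far (a toy sketch under our own assumptions, not DIVINE's implementation):

```python
import heapq
from itertools import count

def csd_search(initial, successors, is_goal):
    """Dijkstra-style exploration in which each transition costs 1 when it
    switches threads and 0 otherwise, so counterexamples with few context
    switches are found first."""
    tie = count()
    pq = [(0, next(tie), initial, None)]   # (switches, tiebreak, state, last thread)
    settled = {}
    while pq:
        switches, _, state, last = heapq.heappop(pq)
        if settled.get((state, last), float("inf")) <= switches:
            continue
        settled[(state, last)] = switches
        if is_goal(state):
            return state, switches
        for nxt, thread in successors(state):
            cost = switches + (last is not None and thread != last)
            heapq.heappush(pq, (cost, next(tie), nxt, thread))
    return None

# Toy model: threads 0 and 1 each increment their own counter to 2.
def succ(state):
    a, b = state
    out = []
    if a < 2: out.append(((a + 1, b), 0))
    if b < 2: out.append(((a, b + 1), 1))
    return out
```

On this toy model the goal state (2, 2) is reached with a single context switch, i.e. the most readable of all witnessing interleavings.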
A New Concept in Advice Complexity of Job Shop Scheduling
In online scheduling problems, we want to assign jobs to machines while optimizing
some given objective function. In the class we study in this paper, we are given
a number $m$ of machines and two jobs that both want to use each of the given
machines exactly once in some predefined order. Each job consists of $m$ tasks
and each task needs to be processed on one particular machine. The objective is
to assign the tasks to the machines while minimizing the makespan, i.e., the
processing time of the job that takes longer. In our model, the tasks arrive
in consecutive time steps and an algorithm must assign a task to a machine
without having full knowledge of the order in which the remaining tasks arrive.
We study the advice complexity of this problem, which is a tool to measure the
amount of information necessary to achieve a certain output quality. A great
deal of research has been carried out in this field; however, this paper studies
the problem in a new setting.
In this setting, the oracle does not know the exact future anymore but only all
possible future scenarios and their probabilities. This way, the additional
information becomes more realistic. We prove that the problem is more difficult
with this oracle than before. Moreover, in job shop scheduling, we provide
a lower bound of 1+1/(6√m) on the competitive ratio of any online algorithm
with advice and prove an upper bound of 1+1/√m on the competitive ratio
of an algorithm by Hromkovič et al.
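For intuition, the makespan in this two-job model can be computed by a simple simulation with unit-time tasks in which a conflict on a machine forces one job to wait (a naive greedy baseline of our own, not the algorithm analysed in the paper):

```python
def greedy_makespan(order1, order2):
    """Unit-time tasks; in each time step every job runs its next task,
    except that when both jobs need the same machine, job 1 proceeds
    and job 2 waits one step."""
    i = j = t = 0
    m = len(order1)
    while i < m or j < m:
        t += 1
        if i < m and j < m and order1[i] == order2[j]:
            i += 1            # conflict: job 1 runs, job 2 waits
        else:
            if i < m: i += 1
            if j < m: j += 1
    return t
```

With identical machine orders, every step up to the last conflicts, and the schedule needs m + 1 steps; with orders that never clash, the trivial lower bound of m steps is met.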
Renata Avros, Vendula Hrubá, Bohuslav Křena, Zdeněk Letko, Hana Pluháčková,
Tomáš Vojnar, Zeev Volkovich, and Shmuel Ur
Boosted Decision Trees for Behaviour Mining of Concurrent Programs
Testing of concurrent programs is difficult since the scheduling
non-determinism requires one to test a huge number of different
thread interleavings. Moreover, a simple repetition of test executions
will typically examine similar interleavings only. One popular way
to deal with this problem is to use the noise injection approach, which
is, however, parameterized with many parameters whose suitable values
are difficult to find. In this paper, we propose a novel application of
classification-based data mining for this purpose. Our approach can identify
which test and noise parameters are the most influential for a given
program and a given testing goal and which values (or ranges of values)
of these parameters are suitable for meeting this goal. We present experiments
that show that our approach can indeed fully automatically
improve noise-based testing of particular programs with a particular testing
goal. At the same time, we use it to obtain new general insights into
noise-based testing as well.
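The parameter-influence question can be illustrated with a single information-gain ranking, a much simpler stand-in for the boosted decision trees used in the paper (all parameter names below are invented):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def most_influential(configs, outcomes):
    """Rank test/noise parameters by information gain with respect to the
    testing goal (one binary outcome per test run); return the best one."""
    base = entropy(outcomes)
    gains = {}
    for param in configs[0]:
        split = {}
        for cfg, out in zip(configs, outcomes):
            split.setdefault(cfg[param], []).append(out)
        remainder = sum(len(v) / len(outcomes) * entropy(v)
                        for v in split.values())
        gains[param] = base - remainder
    return max(gains, key=gains.get)
```

A parameter whose values perfectly split successful from unsuccessful runs gets the maximal gain, which is the kind of signal the mining described above extracts at scale.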
Peter Bezděk, Nikola Beneš, Jiří Barnat, and Ivana Černá
LTL Model Checking of Parametric Timed Automata
The parameter synthesis problem for timed automata is undecidable
in general even for very simple reachability properties. In this
paper we introduce restrictions on parameter valuations under which the
parameter synthesis problem is decidable for LTL properties. The proposed
problem could be solved using an explicit enumeration of all possible
parameter valuations. However, we introduce a symbolic zone-based
method for synthesising bounded integer parameters of parametric timed
automata with an LTL specification. Our method extends the ideas of
the standard automata-based approach to LTL model checking of timed
automata. Our solution employs constrained parametric difference bound
matrices and a suitable notion of extrapolation.
Tomáš Čejka, Lukáš Kekely, Pavel Benáček, Rudolf B. Blažek,
and Hana Kubátová
FPGA Accelerated Change-Point Detection Method for 100 Gb/s Networks
The aim of this paper is a hardware realization of a statistical
anomaly detection method as part of a high-speed monitoring probe
for computer networks. The sequential Non-Parametric Cumulative Sum
(NP-CUSUM) procedure is the detection method of our choice and we
use an FPGA based accelerator card as the target platform. For rapid
detection algorithm development, a high-level synthesis (HLS) approach
is applied. Furthermore, we combine HLS with the usage of Software
Defined Monitoring (SDM) framework on the monitoring probe, which
enables easy deployment of various hardware-accelerated monitoring applications
into high-speed networks. Our implementation of the NP-CUSUM
algorithm serves as a hardware plug-in for SDM and realizes the detection
of network attacks and anomalies directly in FPGA. Additionally, the
parallel nature of the FPGA technology allows us to realize multiple different
detections simultaneously without any losses in throughput. Our
experimental results show the feasibility of HLS and SDM combination
for the effective realization of traffic analysis and anomaly detection in networks
with speeds of up to 100 Gb/s.
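The core of a one-sided CUSUM detector fits in a few lines; the sketch below is a generic software illustration of the change-point idea, not the NP-CUSUM variant or the FPGA implementation:

```python
def cusum(samples, drift, threshold):
    """One-sided CUSUM: accumulate deviations above `drift` and signal the
    first index at which the statistic exceeds `threshold`; return None
    if no change point is detected."""
    s = 0.0
    for t, x in enumerate(samples):
        s = max(0.0, s + x - drift)   # reset to zero while below baseline
        if s > threshold:
            return t
    return None
```

Under normal traffic the statistic hovers at zero; a sustained shift above the drift accumulates linearly, so the detection delay trades off against the false-alarm rate via the threshold.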
Milan Dvořák, Tomáš Závodník, and Jan Kořenek
Hardware Accelerated Book Handling with Unlimited Depth
Strong competitiveness between market participants on electronic
exchanges calls for continuing reduction of latency of trading systems.
Recent efforts are focused on hardware acceleration using FPGA
technology and running trading strategies directly in hardware to eliminate
high latency of system bus and software processing. For any trading
system, book handling is an important time-critical operation, which has
not been accelerated using the FPGA technology yet. Therefore, we propose
a new hybrid hardware-software architecture that processes messages
from the exchange and creates the book with best buy and sell
prices. Based on the analysis of transactions on the exchange, we propose
to store only the most frequently accessed price levels in hardware
and keep the rest in the host memory managed by software. This enables
handling half of the whole stock universe (4 000 instruments) in a single
FPGA. Update of price levels in hardware takes only 240 ns, which is
two orders of magnitude faster than recent software implementations.
Throughput of the hardware unit is 75 million messages per second,
which is 140 times more than current peak rates of the market data.
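The hybrid book-handling idea, keeping only the best price levels "hot", can be sketched in software as follows (a toy model of ours; the actual hardware unit differs substantially):

```python
import bisect

class BookSide:
    """One side of a limit order book. The k best price levels stand in
    for the on-chip storage; the rest would live in host memory."""
    def __init__(self, k, is_bid):
        self.k, self.is_bid = k, is_bid
        self.levels = {}          # price -> aggregate quantity
        self.prices = []          # sorted list of active prices

    def update(self, price, qty_delta):
        """Apply an exchange message: add/remove quantity at a price level."""
        if price not in self.levels:
            bisect.insort(self.prices, price)
            self.levels[price] = 0
        self.levels[price] += qty_delta
        if self.levels[price] <= 0:        # level emptied: drop it
            del self.levels[price]
            self.prices.remove(price)

    def best(self):
        """Best bid is the highest price, best ask the lowest."""
        if not self.prices:
            return None
        return self.prices[-1] if self.is_bid else self.prices[0]

    def hot(self):
        """The k levels nearest the top of the book (the 'hardware' set)."""
        return self.prices[-self.k:] if self.is_bid else self.prices[:self.k]
```

Because updates overwhelmingly hit levels near the top of the book, caching only those in the FPGA keeps the fast path on-chip while rare deep-book updates fall through to software.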
Dušan Kolář and Peter Matula
Composite Data Type Recovery in a Retargetable Decompilation
Decompilation is a reverse engineering technique performing
a transformation of a platform-dependent binary file into a High Level
Language (HLL) representation. Despite its complexity, several decompilers
have been developed in recent years. They are not yet advanced
enough to serve as a standalone tool, but combined with the traditional
disassemblers, they allow much faster understanding of analysed machine
code. To achieve the necessary quality, many advanced analyses must be
performed. One of the toughest, but most rewarding, is the data type reconstruction
analysis. It aims to assign each object a high-level type,
preferably the same as in the original source code. This paper presents
the composite data type analysis used by the retargetable decompiler developed
within the Lissom project at FIT BUT. We design a whole new
address expression (AE) creation algorithm, which is both retargetable
and suitable for operating on code in Static Single Assignment (SSA)
form. Moreover, we devise new AE aggregation rules that increase the
quality of recovered data types.
Vlastimil Košař and Jan Kořenek
Multi-Stride NFA-Split Architecture for Regular Expression Matching Using FPGA
Regular expression matching is a time critical operation for
any network security system. The NFA-Split is an efficient hardware architecture
to match a large set of regular expressions at multigigabit
speed with efficient FPGA logic utilization. Unfortunately, the matching
speed is limited by processing only a single byte per clock cycle.
Therefore, we propose a new multi-stride NFA-Split architecture, which
increases the achievable throughput by processing multiple bytes per clock
cycle. Moreover, we investigate the efficiency of mapping DUs to the FPGA
logic and propose new optimizations of mapping the NFA-Split architecture
to the FPGA. These optimizations reduce FPGA LUT utilization by up to
71.85% and BlockRAM utilization by up to 94.18%.
Alexander Meduna and Ondřej Soukup
Computational Completeness Resulting from Scattered Context Grammars Working Under Various Derivation Modes
This paper introduces and studies a whole variety of derivation
modes in scattered context grammars. These grammars are conceptualized
just like classical scattered context grammars except that
during the application of their rules, after erasing n nonterminals, they
can insert new substrings possibly at different positions than the original
occurrences of the erased nonterminals.
The paper concentrates its attention on investigating the generative
power of scattered context grammars working under these derivation
modes. It demonstrates that all of them are computationally complete -
that is, they characterize the family of recursively enumerable languages.
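For reference, a single derivation step of a classical scattered context rule (A1,...,An) -> (x1,...,xn), with in-place rewriting, can be simulated as follows; the derivation modes studied in the paper relax exactly this in-place behaviour:

```python
def scattered_step(form, lhs, rhs):
    """Apply a scattered context rule to a sentential form (list of
    symbols): rewrite the leftmost occurrences of A1,...,An, matched in
    order, each in place by its x_i. Return None if not applicable."""
    out, i = [], 0
    for sym in form:
        if i < len(lhs) and sym == lhs[i]:
            out.extend(rhs[i])   # replace A_i by x_i at the same position
            i += 1
        else:
            out.append(sym)
    if i < len(lhs):
        return None              # some A_i has no occurrence in order
    return out
```

On the form aAbBc, the rule (A, B) -> (x, yz) yields axbyzc; the generalized modes would instead allow yz to land elsewhere in the form.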
Jana Pazúriková and Luděk Matyska
Convergence of Parareal Algorithm Applied on Molecular Dynamics Simulations
Parallel and distributed computations based on spatial
decomposition of the problem are beginning to fail to saturate large
supercomputers owing to their limited strong scalability. A typical
high-performance computing application, molecular dynamics simulation, shows
this limit especially in experiments with longer simulation times. When
parallelism in space does not suffice, parallelism in time can
cut the wall-clock time at the expense of additional computational resources.
The parareal algorithm, which decomposes the temporal domain,
has been extensively researched and already applied to molecular dynamics
simulations, however with rather modest results. We propose a novel
modification that uses the coarse function with a simpler physics model,
not a longer timestep as in state-of-the-art methods. The evaluation with
a prototype implementation indicates that our method provides a rather
long time window for the parallel-in-time computation, a reasonable convergence
and stable properties of the simulated system.
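The parareal iteration itself is short: a serial coarse sweep corrected by fine solves that are independent across time slices. A generic sketch on a scalar ODE (the coarse propagator here is merely cheaper, standing in for the simpler-physics model proposed above):

```python
def parareal(y0, T, n_slices, fine, coarse, iters):
    """Generic parareal iteration: serial coarse predictions corrected by
    fine solves, which could run in parallel across the time slices."""
    dt = T / n_slices
    u = [y0]
    for _ in range(n_slices):           # initial serial coarse sweep
        u.append(coarse(u[-1], dt))
    for _ in range(iters):
        f = [fine(u[n], dt) for n in range(n_slices)]     # parallel in time
        g = [coarse(u[n], dt) for n in range(n_slices)]
        new = [y0]
        for n in range(n_slices):       # serial correction sweep
            new.append(coarse(new[-1], dt) + f[n] - g[n])
        u = new
    return u

# Test problem y' = -y on [0, 1].
fine = lambda y, dt: y * (1 - dt / 100) ** 100   # 100 Euler substeps
coarse = lambda y, dt: y * (1 - dt)              # a single, cheaper Euler step
```

A known property makes this easy to check: after as many iterations as there are time slices, parareal reproduces the serial fine solution exactly; the practical interest lies in getting close in far fewer iterations.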
A Case for a Multifaceted Fairness Model: An Overview of Fairness Methods for Job Queuing and Scheduling
Job scheduling for HPC and Grid-like systems, while being
a heavily studied subject, suffers from a particular disconnect between
theoretical approaches and practical applications. Most production systems
still rely on a small set of rather conservative scheduling policies.
One of the areas that tries to bridge the world of scientific research and
practical application is the study of fairness. Fairness in a system has
strong implications on customer satisfaction, with psychological studies
suggesting that perceived fairness is generally even more important than
the quality of service. This paper provides an overview of different approaches
for handling fairness in a job scheduling/queuing system. We
start with analytic approaches that rely on statistical modeling and try
to provide strong categorization and ordering of various scheduling policies
according to their fairness. Following that we provide an overview
of recent advancements that rely on simulations and use high resolution
analysis to extract fairness information from realistic job traces. As a
conclusion to this article, we propose a new direction for research. We
propose a novel multifaceted fairness approach, i.e., a combination of different
fairness models inside a single system, that could better capture
the heterogeneous fairness-related requirements of different users in the
system. It could serve as a solution to the shortcomings of some of the
methods presented in this paper.