Summary: arXiv Paper Daily: Thu, 15 Jun 2017
Neural and Evolutionary Computing
A Fast Foveated Fully Convolutional Network Model for Human Peripheral Vision
Comments: NIPS 2017 submission
Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
Visualizing the information available to a human observer in a single glance
at an image provides a powerful tool for evaluating models of full-field human
vision. The hard part is human-realistic visualization of the periphery.
Degradation of information with distance from fixation is far more complex than
a mere reduction of acuity that might be mimicked using blur with a standard
deviation that linearly increases with eccentricity. Rather,
behaviorally-validated models hypothesize that peripheral vision measures a
large number of local texture statistics in pooling regions that overlap, grow
with eccentricity, and tile the visual field. We propose a “foveated” variant
of a fully convolutional network that approximates one such model. Our approach
achieves a 21,000-fold reduction in average running time (from 4.2 hours to 0.7
seconds per image), and statistically similar results to the
behaviorally-validated model.
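The two quantitative statements in this abstract can be checked with a short sketch. The first function reproduces the running-time ratio (4.2 hours against 0.7 seconds gives roughly 21,600, consistent with the rounded 21,000-fold figure). The second illustrates the simple baseline that the abstract argues is insufficient: a blur whose standard deviation grows linearly with eccentricity. The function names and the `slope` and `sigma_at_fixation` parameters are assumptions made for illustration; they do not come from the paper.

```python
def speedup(baseline_hours, fast_seconds):
    """Return how many times faster the fast method is than the baseline."""
    return baseline_hours * 3600.0 / fast_seconds


def linear_blur_sigma(eccentricity_deg, slope=0.1, sigma_at_fixation=0.0):
    """Blur standard deviation that increases linearly with eccentricity.

    The slope and the value at fixation are hypothetical parameters; the
    abstract only states that such a linear model is too simple.
    """
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return sigma_at_fixation + slope * eccentricity_deg


if __name__ == "__main__":
    print(round(speedup(4.2, 0.7)))
    print(linear_blur_sigma(10.0, slope=0.5))
```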
MATIC: Adaptation and In-situ Canaries for Energy-Efficient Neural Network Acceleration
Sung Kim , Patrick Howe , Thierry Moreau , Armin Alaghi , Luis Ceze , Visvesh Sathe Subjects : Neural and Evolutionary Computing (cs.NE)
We present MATIC (Memory-Adaptive Training and In-situ Canaries), a voltage
scaling methodology that addresses the SRAM efficiency bottleneck in DNN
accelerators. To overscale DNN weight SRAMs, MATIC combines specific
characteristics of destructive SRAM reads with the error resilience of neural
networks in a memory-adaptive training process. PVT-related voltage margins are
eliminated using bit-cells from synaptic weights as in-situ canaries to track
runtime environmental variation. Demonstrated on a low-power DNN accelerator
fabricated in 65nm CMOS, MATIC enables up to 3.3x total energy reduction, or
18.6x application error reduction.
Neural Models for Key Phrase Detection and Question Generation
Sandeep Subramanian , Tong Wang , Xingdi Yuan , Adam Trischler Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We propose several neural models arranged in a two-stage framework to tackle
question generation from documents. First, we estimate the probability of
“interesting” answers in a document using a neural model trained on a
question-answering corpus. The predicted key phrases are then used as answers
to condition a sequence-to-sequence question generation model. Empirically, our
neural key phrase detection models significantly outperform an entity-tagging
baseline system. We demonstrate that the question generator formulates good
quality natural language questions from extracted key phrases. The resulting
questions and answers can be used to assess reading comprehension in
educational settings.
Transfer entropy-based feedback improves performance in artificial neural networks
Sebastian Herzog , Christian Tetzlaff , Florentin Wörgötter Subjects : Learning (cs.LG) ; Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE)
The structure of the majority of modern deep neural networks is characterized
by unidirectional feed-forward connectivity across a very large number of
layers. By contrast, the architecture of the cortex of vertebrates contains
fewer hierarchical levels but many recurrent and feedback connections. Here we
show that a small, few-layer artificial neural network that employs feedback
will reach top-level performance on a standard benchmark task, otherwise only
obtained by large feed-forward structures. To achieve this we use feed-forward
transfer entropy between neurons to structure feedback connectivity. Transfer
entropy can here intuitively be understood as a measure for the relevance of
certain pathways in the network, which are then amplified by feedback. Feedback
may therefore be key for high network performance in small brain-like
architectures.
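For readers unfamiliar with transfer entropy, the following is a minimal plug-in estimator for two discrete time series with a history length of one step. It is a sketch of the general quantity only; how the authors estimate transfer entropy between neurons, and how they turn it into feedback connections, is not described in the abstract. The function name `transfer_entropy` is an assumption.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Estimate TE(source -> target) in bits using history length 1.

    TE = sum p(y_next, y_now, x_now)
             * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    if len(source) != len(target):
        raise ValueError("series must have equal length")
    n = len(target) - 1
    if n < 1:
        raise ValueError("need at least two time steps")

    triples = Counter()   # counts of (y_next, y_now, x_now)
    pairs_yx = Counter()  # counts of (y_now, x_now)
    pairs_yy = Counter()  # counts of (y_next, y_now)
    singles = Counter()   # counts of y_now
    for t in range(n):
        y_next, y_now, x_now = target[t + 1], target[t], source[t]
        triples[(y_next, y_now, x_now)] += 1
        pairs_yx[(y_now, x_now)] += 1
        pairs_yy[(y_next, y_now)] += 1
        singles[y_now] += 1

    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y_now, x_now)]
        p_given_self = pairs_yy[(y_next, y_now)] / singles[y_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


if __name__ == "__main__":
    src = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1]
    tgt = [0] + src[:-1]  # the target copies the source with a one-step delay
    print(transfer_entropy(src, tgt))
```

When the target simply copies the source with a delay, the source fully determines the next target value, so the estimate is large; a constant source carries no information and yields zero.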
Adversarially Regularized Autoencoders for Generating Discrete Structures
Junbo (Jake) Zhao , Yoon Kim , Kelly Zhang , Alexander M. Rush , Yann LeCun Subjects : Learning (cs.LG) ; Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Generative adversarial networks are an effective approach for learning rich
latent representations of continuous data, but have proven difficult to apply
directly to discrete structured data, such as text sequences or discretized
images. Ideally we could encode discrete structures in a continuous code space
to avoid this problem, but it is difficult to learn an appropriate
general-purpose encoder. In this work, we consider a simple approach for
handling these two challenges jointly, employing a discrete structure
autoencoder with a code space regularized by generative adversarial training.
The model learns a smooth regularized code space while still being able to
model the underlying data, and can be used as a discrete GAN with the ability
to generate coherent discrete outputs from continuous samples. We demonstrate
empirically how key properties of the data are captured in the model’s latent
space, and evaluate the model itself on the tasks of discrete image generation,
text generation, and semi-supervised learning.
Identifying Spatial Relations in Images using Convolutional Neural Networks
Mandar Haldekar , Ashwinkumar Ganesan , Tim Oates Subjects : Artificial Intelligence (cs.AI) ; Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
Traditional approaches to building a large scale knowledge graph have usually
relied on extracting information (entities, their properties, and relations
between them) from unstructured text (e.g., DBpedia). Recent advances in
Convolutional Neural Networks (CNN) allow us to shift our focus to learning
entities and relations from images, as they build robust models that require
little or no pre-processing of the images. In this paper, we present an
approach to identify and extract spatial relations (e.g., The girl is standing
behind the table) from images using CNNs. Our research addresses two specific
challenges: providing insight into how spatial relations are learned by the
network and which parts of the image are used to predict these relations. We
use the pre-trained network VGGNet to extract features from an image and train
a Multi-layer Perceptron (MLP) on a set of synthetic images and the SUN09
dataset to extract spatial relations. The MLP predicts spatial relations
without a bounding box around the objects or the space in the image depicting
the relation. To understand how the spatial relations are represented in the
network, a heatmap is overlaid on the image to show the regions that are
deemed important by the network. Also, we analyze the MLP to show the
relationship between the activation of consistent groups of nodes and the
prediction of a spatial relation. We show how the loss of these groups affects
the network's ability to identify relations.
Computer Vision and Pattern Recognition
Learning without Prejudice: Avoiding Bias in Webly-Supervised Action Recognition
Comments: Submitted to CVIU SI: Computer Vision and the Web
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Webly-supervised learning has recently emerged as an alternative paradigm to
traditional supervised learning based on large-scale datasets with manual
annotations. The key idea is that models such as CNNs can be learned from the
noisy visual data available on the web. In this work we aim to exploit web data
for video understanding tasks such as action recognition and detection. One of
the main problems in webly-supervised learning is cleaning the noisy labeled
data from the web. The state-of-the-art paradigm relies on training a first
classifier on noisy data that is then used to clean the remaining dataset. Our
key insight is that this procedure biases the second classifier towards samples
that the first one understands. Here we train two independent CNNs, an RGB
network on web images and video frames and a second network using temporal
information from optical flow. We show that training the networks independently
is vastly superior to selecting the frames for the flow classifier by using our
RGB network. Moreover, we show benefits in enriching the training set with
different data sources from heterogeneous public web databases. We demonstrate
that our framework outperforms all other webly-supervised methods on two public
benchmarks, UCF-101 and Thumos’14.
Learning local shape descriptors with view-based convolutional networks
Haibin Huang , Evangelos Kalogerakis , Siddhartha Chaudhuri , Duygu Ceylan , Vladimir G. Kim , Ersin Yumer Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Graphics (cs.GR)
We present a new local descriptor for 3D shapes, directly applicable to a
wide range of shape analysis problems such as point correspondences, semantic
segmentation, affordance prediction, and shape-to-scan matching. Our key
insight is that the neighborhood of a point on a shape is effectively captured
at multiple scales by a succession of progressively zoomed out views, taken
from carefully selected camera positions. We propose a convolutional neural
network that uses local views around a point to embed it to a multidimensional
descriptor space, such that geometrically and semantically similar points are
close to one another. To train our network, we leverage two extremely large
sources of data. First, since our network processes 2D images, we repurpose
architectures pre-trained on massive image datasets. Second, we automatically
generate a synthetic dense correspondence dataset by part-aware, non-rigid
alignment of a massive collection of 3D models. As a result of these design
choices, our view-based architecture effectively encodes multi-scale local
context and fine-grained surface detail. We demonstrate through several
experiments that our learned local descriptors are more general and robust
compared to state-of-the-art alternatives, and have a variety of applications
without any additional fine-tuning.
Large-Scale YouTube-8M Video Understanding with Deep Neural Networks
Comments: 6 pages, 5 figures, 3 tables
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
The video classification problem has been studied for many years. The success
of Convolutional Neural Networks (CNNs) in image recognition tasks gives
researchers a powerful incentive to create more advanced video classification
approaches. Because video has temporal content, Long Short-Term Memory (LSTM)
networks are a handy tool for modeling long-term temporal clues. Both
approaches need a large dataset of input data. In this paper, three models are
provided to address video classification using the recently announced
YouTube-8M large-scale dataset. The first model is based on a frame pooling
approach. The other two models are based on LSTM networks. A Mixture of
Experts intermediate layer is used in the third model, which increases model
capacity without dramatically increasing computation. A set of experiments on
handling imbalanced training data has been conducted.
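The Mixture of Experts layer mentioned above can be illustrated with a minimal dense, softmax-gated version: a gating function assigns a weight to each expert, and the layer output is the weighted sum of the expert outputs. This is a generic sketch, not the layer used in the paper; the abstract does not specify the gating scheme, and every name below (`mixture_of_experts`, `linear`, the weight layout) is an assumption.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply an affine map; weights is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mixture_of_experts(x, gate_weights, gate_bias, experts):
    """Combine expert outputs using softmax gate weights.

    experts is a list of (weights, bias) pairs, one linear expert each.
    """
    gates = softmax(linear(gate_weights, gate_bias, x))
    outputs = [linear(w, b, x) for w, b in experts]
    dim = len(outputs[0])
    return [sum(g * out[j] for g, out in zip(gates, outputs))
            for j in range(dim)]


if __name__ == "__main__":
    x = [1.0, 2.0]
    experts = [([[1.0, 0.0]], [0.0]),   # expert A returns x[0]
               ([[0.0, 1.0]], [0.0])]   # expert B returns x[1]
    uniform_gate = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
    print(mixture_of_experts(x, uniform_gate[0], uniform_gate[1], experts))
```

With an all-zero gate the two experts are averaged; a gate that strongly prefers one expert makes the layer behave like that expert alone.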
SalProp: Salient object proposals via aggregated edge cues
Comments: 5 pages, 4 figures, accepted at ICIP 2017
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
In this paper, we propose a novel object proposal generation scheme by
formulating a graph-based salient edge classification framework that utilizes
the edge context. In the proposed method, we construct a Bayesian probabilistic
edge map to assign a saliency value to the edgelets by exploiting low level
edge features. A Conditional Random Field is then learned to effectively
combine these features for edge classification with object/non-object labels. We
propose an objectness score for the generated windows by analyzing the salient
edge density inside the bounding box. Extensive experiments on the PASCAL VOC
2007 dataset demonstrate that the proposed method gives competitive performance
against 10 popular generic object detection techniques while using fewer
proposals.
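The idea of scoring a window by the density of salient edges inside it can be sketched as follows. The abstract does not give the actual objectness formula, so this example simply averages a per-pixel edge saliency map over the box; the function name, the box convention, and the plain averaging are all assumptions.

```python
def edge_density_score(saliency_map, box):
    """Mean edge saliency inside a box.

    saliency_map is a list of rows of floats in [0, 1].
    box is (top, left, bottom, right) with bottom and right exclusive.
    """
    top, left, bottom, right = box
    area = (bottom - top) * (right - left)
    if area <= 0:
        raise ValueError("box must have positive area")
    total = sum(saliency_map[r][c]
                for r in range(top, bottom)
                for c in range(left, right))
    return total / area


if __name__ == "__main__":
    saliency = [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 1.0, 0.0],
        [0.0, 1.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    # A tight box around the salient block scores higher than the full image.
    print(edge_density_score(saliency, (1, 1, 3, 3)))
    print(edge_density_score(saliency, (0, 0, 4, 4)))
```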
(ν)-net: Deep Learning for Generalized Biventricular Cardiac Mass and Function Parameters
Hinrich B Winther , Christian Hundt , Bertil Schmidt , Christoph Czerner , Johann Bauersachs , Frank Wacker , Jens Vogel-Claussen Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (stat.ML)
Background: Cardiac MRI derived biventricular mass and function parameters,
such as end-systolic volume (ESV), end-diastolic volume (EDV), ejection
fraction (EF), stroke volume (SV), and ventricular mass (VM) are clinically
well established. Image segmentation can be challenging and time-consuming, due
to the complex anatomy of the human heart.
Objectives: This study introduces (ν)-net (/nju:nɛt/), a deep
learning approach allowing for fully-automated high quality segmentation of
right (RV) and left ventricular (LV) endocardium and epicardium for extraction
of cardiac function parameters.
Methods: A set consisting of 253 manually segmented cases has been used to
train a deep neural network. Subsequently, the network has been evaluated on 4
different multicenter data sets with a total of over 1000 cases.
Results: For LV EF the intraclass correlation coefficient (ICC) is 98, 95,
and 80 % (95 %), and for RV EF 96 and 87 % (80 %) on the respective data sets
(human expert ICCs reported in parentheses). The LV VM ICC is 95 and 94 % (84
%), and the RV VM ICC is 83 and 83 % (54 %). This study proposes a simple
adjustment procedure, allowing for the adaptation to distinct segmentation
philosophies. (ν)-net exhibits state-of-the-art performance in terms of Dice
coefficient.
Conclusions: Biventricular mass and function parameters can be determined
reliably in high quality by applying a deep neural network for cardiac MRI
segmentation, especially in the anatomically complex right ventricle. Adaptation
to individual segmentation styles by applying a simple adjustment procedure is
viable, allowing for the processing of novel data without time-consuming
additional training.
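The function parameters named in the Background follow from the segmented volumes by standard definitions: stroke volume is SV = EDV - ESV and ejection fraction is EF = SV / EDV. The Dice coefficient used to evaluate segmentations is 2|A ∩ B| / (|A| + |B|). The sketch below applies these textbook formulas; the function names and the flat 0/1 mask representation are assumptions, and nothing here reflects the paper's actual pipeline.

```python
def stroke_volume(edv_ml, esv_ml):
    """Stroke volume in millilitres: SV = EDV - ESV."""
    return edv_ml - esv_ml


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent: EF = 100 * (EDV - ESV) / EDV."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two flat binary masks (sequences of 0/1)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same size")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_sum = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if size_sum == 0:
        return 1.0  # two empty masks agree perfectly
    return 2.0 * overlap / size_sum


if __name__ == "__main__":
    print(stroke_volume(120.0, 50.0))
    print(ejection_fraction(120.0, 50.0))
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
```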
Alignment Distances on Systems of Bags
Alexander Sagel , Martin Kleinsteuber Subjects : Computer Vision and Pattern Recognition (cs.CV)
Recent research in image and video recognition indicates that many visual
processes can be thought of as being generated by a time-varying generative
model. A nearby descriptive model for visual processes is thus a statistical
distribution that varies over time. Specifically, modeling visual processes as
streams of histograms generated by a kernelized linear dynamic system turns out
to be efficient. We refer to such a model as a System of Bags. In this work, we
investigate Systems of Bags with special emphasis on dynamic scenes and dynamic
textures. Parameters of linear dynamic systems suffer from ambiguities. In
order to cope with these ambiguities in the kernelized setting, we develop a
kernelized version of the alignment distance. For its computation, we use a
Jacobi-type method and prove its convergence to a set of critical points. We
employ it as a dissimilarity measure on Systems of Bags. As such, it
outperforms other known dissimilarity measures for kernelized linear dynamic
systems, in particular the Martin Distance and the Maximum Singular Value
Distance, in every tested classification setting. A considerable margin can be
observed in settings where classification is performed with respect to an
abstract mean of video sets. For this scenario, the presented approach can
outperform state-of-the-art techniques, such as Dynamic Fractal Spectrum or
Orthogonal Tensor Dictionary Learning.
Shape-Color Differential Moment Invariants under Affine Transformations
Comments: 13 pages, 4 figures
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
We propose a general construction formula for shape-color primitives that
uses the partial derivatives of each color channel. From these shape-color
primitives, shape-color differential moment invariants (SCDMIs) can be
constructed easily; they are invariant to shape affine and color affine
transforms. In total, 50 instances of SCDMIs are obtained. In experiments,
several commonly used color descriptors and the SCDMIs are applied to image
classification and retrieval of color images. Comparing the experimental
results, we find that SCDMIs perform better.
Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection
Comments: accepted by MICCAI 2017
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
We propose a convolutional neural network-based algorithm for simultaneously
diagnosing diabetic retinopathy and highlighting suspicious regions. Our
contributions are twofold: 1) a network termed Zoom-in-Net, which mimics the
zoom-in process of a clinician examining retinal images. Trained with only
image-level supervision, Zoom-in-Net can generate attention maps which
highlight suspicious regions, and predicts the disease level accurately based
on both the whole image and its high resolution suspicious patches. 2) Only
four bounding boxes generated from the automatically learned attention maps are
enough to cover 80% of the lesions labeled by an experienced ophthalmologist,
which shows good localization ability of the attention maps. By clustering
features at high response locations on the attention maps, we discover
meaningful clusters which contain potential lesions in diabetic retinopathy.
Experiments show that our algorithm outperforms the state-of-the-art methods on
two datasets, EyePACS and Messidor.
Hierarchical Gaussian Descriptors with Application to Person Re-Identification
Comments: 14 pages, 12 figures, 4 tables
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Describing the color and textural information of a person image is one of the
most crucial aspects of person re-identification (re-id). In this paper, we
present novel meta-descriptors based on a hierarchical distribution of pixel
features. Although hierarchical covariance descriptors have been successfully
applied to image classification, the mean information of pixel features, which
is absent from the covariance, tends to be the major discriminative information
for person re-id. To solve this problem, we describe a local region in an image
via hierarchical Gaussian distribution in which both means and covariances are
included in their parameters. More specifically, the region is modeled as a set
of multiple Gaussian distributions in which each Gaussian represents the
appearance of a local patch. The characteristics of the set of Gaussians are
again described by another Gaussian distribution. In both steps, we embed the
parameters of the Gaussian into a point of Symmetric Positive Definite (SPD)
matrix manifold. By changing the way to handle mean information in this
embedding, we develop two hierarchical Gaussian descriptors. Additionally, we
develop feature norm normalization methods with the ability to alleviate the
biased trends that exist on the descriptors. The experimental results conducted
on five public datasets indicate that the proposed descriptors achieve
remarkably high performance on person re-id.
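To make the step of embedding a Gaussian into the SPD manifold concrete, here is a sketch of one embedding known from the information-geometry literature: a d-dimensional Gaussian with mean μ and covariance Σ is mapped to the (d+1)×(d+1) block matrix [[Σ + μμᵀ, μ], [μᵀ, 1]], which is positive definite whenever Σ is. The abstract does not state which embeddings the two proposed descriptors use, and a determinant-based scale factor that is sometimes applied is omitted here, so this should be read as an illustration under those assumptions. The helper names are hypothetical.

```python
import math


def embed_gaussian(mean, cov):
    """Map N(mean, cov) to the block matrix [[cov + m m^T, m], [m^T, 1]]."""
    d = len(mean)
    rows = [[cov[i][j] + mean[i] * mean[j] for j in range(d)] + [mean[i]]
            for i in range(d)]
    rows.append(list(mean) + [1.0])
    return rows


def is_spd(matrix, tol=1e-12):
    """Check symmetric positive definiteness via a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            if abs(matrix[i][j] - matrix[j][i]) > 1e-9:
                return False  # not symmetric
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False  # non-positive pivot
                lower[i][i] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True


if __name__ == "__main__":
    point = embed_gaussian([1.0, 2.0], [[2.0, 0.5], [0.5, 1.0]])
    print(point)
    print(is_spd(point))
```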
Teaching Compositionality to CNNs
Comments: Preprint appearing in CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Convolutional neural networks (CNNs) have shown great success in computer
vision, approaching human-level performance when trained for specific tasks via
application-specific loss functions. In this paper, we propose a method for
augmenting and training CNNs so that their learned features are compositional.
It encourages networks to form representations that disentangle objects from
their surroundings and from each other, thereby promoting better
generalization. Our method is agnostic to the specific details of the
underlying CNN to which it is applied and can in principle be used with any
CNN. As we show in our experiments, the learned representations lead to feature
activations that are more localized and improve performance over
non-compositional baselines in object recognition tasks.
Photo-realistic Facial Texture Transfer
Parneet Kaur , Hang Zhang , Kristin J. Dana Subjects : Computer Vision and Pattern Recognition (cs.CV)
Style transfer methods have achieved significant success in recent years with
the use of convolutional neural networks. However, many of these methods
concentrate on artistic style transfer with few constraints on the output image
appearance. We address the challenging problem of transferring face texture
from a style face image to a content face image in a photorealistic manner
without changing the identity of the original content image. Our framework for
face texture transfer (FaceTex) augments the prior work of MRF-CNN with a novel
facial semantic regularization that incorporates a face prior regularization
smoothly suppressing the changes around facial meso-structures (e.g., eyes, nose
and mouth) and a facial structure loss function which implicitly preserves the
facial structure so that face texture can be transferred without changing the
original identity. We demonstrate results on face images and compare our
approach with recent state-of-the-art methods. Our results demonstrate superior
texture transfer because of the ability to maintain the identity of the
original face image.
Comments: MICCAI 2017 accepted
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Early detection of pulmonary cancer is the most promising way to enhance a
patient’s chance for survival. Accurate pulmonary nodule detection in computed
tomography (CT) images is a crucial step in diagnosing pulmonary cancer. In
this paper, inspired by the successful use of deep convolutional neural
networks (DCNNs) in natural image recognition, we propose a novel pulmonary
nodule detection approach based on DCNNs. We first introduce a deconvolutional
structure to Faster Region-based Convolutional Neural Network (Faster R-CNN)
for candidate detection on axial slices. Then, a three-dimensional DCNN is
presented for the subsequent false positive reduction. Experimental results of
the LUng Nodule Analysis 2016 (LUNA16) Challenge demonstrate the superior
detection performance of the proposed approach on nodule detection (average
FROC-score of 0.893, ranking the 1st place over all submitted results), which
outperforms the best result on the leaderboard of the LUNA16 Challenge (average
FROC-score of 0.864).
Saliency detection by aggregating complementary background template with optimization framework
Comments: 28 pages,10 figures
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
This paper proposes an unsupervised bottom-up saliency detection approach by
aggregating complementary background template with refinement. Feature vectors
are extracted from each superpixel to cover regional color, contrast and
texture information. By using these features, a coarse detection for salient
region is realized based on background template achieved by different
combinations of boundary regions instead of only treating four boundaries as
background. Then, by ranking the relevance of the image nodes with foreground
cues extracted from the former saliency map, we obtain an improved result.
Finally, smoothing operation is utilized to refine the foreground-based
saliency map to improve the contrast between salient and non-salient regions
until a close to binary saliency map is reached. Experimental results show that
the proposed algorithm generates more accurate saliency maps and performs
favorably against the state-of-the-art saliency detection methods on four
publicly available datasets.
When Image Denoising Meets High-Level Vision Tasks: A Deep Learning Approach
Ding Liu , Bihan Wen , Xianming Liu , Thomas S. Huang Subjects : Computer Vision and Pattern Recognition (cs.CV)
Conventionally, image denoising and high-level vision tasks are handled
separately in computer vision, and their connection is fragile. In this paper,
we cope with the two jointly and explore the mutual influence between them,
with the focus on two questions, namely (1) how image denoising can help
solve high-level vision problems, and (2) how the semantic information from
high-level vision tasks can be used to guide image denoising. We propose a deep
convolutional neural network solution that cascades two modules for image
denoising and various high level tasks, respectively, and propose the use of
a joint loss for training so that semantic information can flow into the
optimization of the denoising network via back-propagation. Our experimental
results demonstrate that the proposed architecture not only yields superior
image denoising results preserving fine details, but also overcomes the
performance degradation of different high-level vision tasks, e.g., image
classification and semantic segmentation, due to image noise or artifacts
caused by conventional denoising approaches such as over-smoothing.
Comments: submitted to Journal of Visual Communication and Image Representation. 26 pages, 7 figures, 7 tables
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Gender classification aims at recognizing a person’s gender. Despite the high
accuracy achieved by state-of-the-art methods for this task, there is still room
for improvement in generalized and unrestricted datasets. In this paper, we
advocate a new strategy inspired by the behavior of humans in gender
recognition. Instead of dealing with the face image as a sole feature, we rely
on the combination of isolated facial features and a holistic feature which we
call the foggy face. Then, we use these features to train deep convolutional
neural networks followed by an AdaBoost-based score fusion to infer the final
gender class. We evaluate our method on four challenging datasets to
demonstrate its efficacy in achieving better or on-par accuracy with
state-of-the-art methods. In addition, we present a new face dataset that
intensifies the challenges of occluded faces and illumination changes, which we
believe to be a much-needed resource for gender classification research.
Action Search: Learning to Search for Human Activities in Untrimmed Videos
Comments: 9 pages, 9 figures
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Traditional approaches for action detection use trimmed data to learn
sophisticated action detector models. Although these methods have achieved
great success at detecting human actions, we argue that much information is
discarded when ignoring the process through which this trimmed data is
obtained. In this paper, we propose Action Search, a novel approach that mimics
the way people annotate activities in video sequences. Using a Recurrent Neural
Network, Action Search can efficiently explore a video and determine the time
boundaries during which an action occurs. Experiments on the THUMOS14 dataset
reveal that our model is not only able to explore the video efficiently but
also to find human activities accurately, outperforming state-of-the-art methods.
von Mises-Fisher Mixture Model-based Deep learning: Application to Face Verification
Comments: Under review
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
A number of pattern recognition tasks, e.g., face verification, can be boiled
down to classification or clustering of unit length directional feature vectors
whose distance can be simply computed by their angle. In this paper, we propose
the von Mises-Fisher (vMF) mixture model as the theoretical foundation for an
effective deep-learning of such directional features and derive a novel vMF
Mixture Loss and its corresponding vMF deep features. The proposed vMF feature
learning is discriminative, i.e., it compacts the instances of the same class
while increasing the distance between instances from different classes, and it
subsumes a number of loss functions and deep learning practices,
e.g., normalization. The experiments carried out on face verification using 4
different challenging face datasets, i.e., LFW, IJB-A, YouTube Faces and CACD,
show the effectiveness of the proposed approach, which displays very
competitive and state-of-the-art results.
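The premise of this abstract, that unit-length feature vectors are compared by the angle between them, can be shown in a few lines. The sketch normalizes two vectors and returns the angle in radians; it illustrates only the distance, not the vMF mixture loss itself, and the function names are assumptions.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vector]


def angular_distance(u, v):
    """Angle in radians between two vectors after normalization."""
    cosine = sum(a * b for a, b in zip(normalize(u), normalize(v)))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.acos(cosine)


if __name__ == "__main__":
    print(angular_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
    print(angular_distance([3.0, 4.0], [6.0, 8.0]))  # same direction
```

Because only direction matters, rescaling a feature vector leaves its distance to every other vector unchanged.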
The "something something" video database for learning and evaluating visual common sense
Raghav Goyal , Samira Kahou , Vincent Michalski , Joanna Materzyńska , Susanne Westphal , Heuna Kim , Valentin Haenel , Ingo Fruend , Peter Yianilos , Moritz Mueller-Freitag , Florian Hoppe , Christian Thurau , Ingo Bax , Roland Memisevic Subjects : Computer Vision and Pattern Recognition (cs.CV)
Neural networks trained on datasets such as ImageNet have led to major
advances in visual object classification. One obstacle that prevents networks
from reasoning more deeply about complex scenes and situations, and from
integrating visual knowledge with natural language, like humans do, is their
lack of common sense knowledge about the physical world. Videos, unlike still
images, contain a wealth of detailed information about the physical world.
However, most labelled video datasets represent high-level concepts rather than
detailed physical aspects of actions and scenes. In this work, we describe
our ongoing collection of the “something-something” database of video
prediction tasks whose solutions require a common sense understanding of the
depicted situation. The database currently contains more than 100,000 videos
across 174 classes, which are defined as caption-templates. We also describe
the challenges in crowd-sourcing this data at scale.
Online Convolutional Dictionary Learning for Multimodal Imaging
Kevin Degraux , Ulugbek S. Kamilov , Petros T. Boufounos , Dehong Liu Subjects : Computer Vision and Pattern Recognition (cs.CV)
Computational imaging methods that can exploit multiple modalities have the
potential to enhance the capabilities of traditional sensing systems. In this
paper, we propose a new method that reconstructs multimodal images from their
linear measurements by exploiting redundancies across different modalities. Our
method combines a convolutional group-sparse representation of images with
total variation (TV) regularization for high-quality multimodal imaging. We
develop an online algorithm that enables the unsupervised learning of
convolutional dictionaries on large-scale datasets that are typical in such
applications. We illustrate the benefit of our approach in the context of joint
intensity-depth imaging.
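Total variation regularization penalizes the summed magnitude of differences between neighboring pixels, which favors piecewise-smooth reconstructions. The sketch below computes the anisotropic variant for a small 2D image. Whether the paper uses the anisotropic or isotropic form is not stated in the abstract, so that choice, like the function name, is an assumption.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D image given as a list of rows.

    Sums absolute differences between horizontally and vertically
    adjacent pixels.
    """
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                tv += abs(image[r][c + 1] - image[r][c])
            if r + 1 < rows:
                tv += abs(image[r + 1][c] - image[r][c])
    return tv


if __name__ == "__main__":
    flat = [[1.0, 1.0], [1.0, 1.0]]
    step = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
    checker = [[0.0, 1.0], [1.0, 0.0]]
    # A constant image has zero TV; an oscillating pattern has high TV.
    print(total_variation(flat), total_variation(step), total_variation(checker))
```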
Automatic Localization of Deep Stimulation Electrodes Using Trajectory-based Segmentation Approach
Comments: 13 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
Parkinson’s disease (PD) is a degenerative condition of the nervous system,
which manifests itself primarily as muscle stiffness, hypokinesia,
bradykinesia, and tremor. In patients suffering from advanced stages of PD,
Deep Brain Stimulation neurosurgery (DBS) is the best alternative to medical
treatment, especially when they become tolerant to the drugs. This surgery
produces neuronal activity as a result of electrical stimulation, whose
quantification is known as the Volume of Tissue Activated (VTA). To locate
the VTA correctly in the cerebral volume space, one must know the exact
location of the tip of the DBS electrodes, as well as their spatial projection.
In this paper, we automatically locate DBS electrodes using a threshold-based
medical imaging segmentation methodology, determining the optimal value of this
threshold adaptively. The proposed methodology allows the localization of DBS
electrodes in Computed Tomography (CT) images, with high noise tolerance, using
automatic threshold detection methods.
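The abstract does not name the automatic threshold detection methods it uses. As a stand-in, the sketch below uses Otsu's method, a widely used way to pick a threshold adaptively by maximizing the between-class variance of the intensity histogram, and then keeps the pixels above it, as one might for bright metallic electrodes in CT. Both the choice of Otsu's method and the function names are assumptions made for illustration.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t maximizing between-class variance.

    values are integers in [0, levels); class 0 holds values <= t.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue          # no pixels in the lower class yet
        weight1 = total - weight0
        if weight1 == 0:
            break             # all pixels are in the lower class
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def segment_bright(values, threshold):
    """Binary mask marking values strictly above the threshold."""
    return [1 if v > threshold else 0 for v in values]


if __name__ == "__main__":
    # Mostly dark tissue intensities plus a few very bright pixels.
    pixels = [10] * 50 + [12] * 50 + [200] * 5 + [210] * 5
    t = otsu_threshold(pixels)
    print(t, sum(segment_bright(pixels, t)))
```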
Deep Learning Methods for Efficient Large Scale Video Labeling
Comments: 7 pages, 5 tables, 1 figure
Subjects:
Machine Learning (stat.ML)
; Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
We present a solution to the “Google Cloud and YouTube-8M Video Understanding
Challenge” that ranked 5th. The proposed model is an ensemble of three
model families, two frame-level and one video-level. The training was performed
on an augmented dataset, with cross-validation.
A Fast Foveated Fully Convolutional Network Model for Human Peripheral Vision
Comments: NIPS 2017 submission
Subjects:
Neural and Evolutionary Computing (cs.NE)
; Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
Visualizing the information available to a human observer in a single glance
at an image provides a powerful tool for evaluating models of full-field human
vision. The hard part is human-realistic visualization of the periphery.
Degradation of information with distance from fixation is far more complex than
a mere reduction of acuity that might be mimicked using blur with a standard
deviation that linearly increases with eccentricity. Rather,
behaviorally-validated models hypothesize that peripheral vision measures a
large number of local texture statistics in pooling regions that overlap, grow
with eccentricity, and tile the visual field. We propose a “foveated” variant
of a fully convolutional network that approximates one such model. Our approach
achieves a 21,000-fold reduction in average running time (from 4.2 hours to 0.7
seconds per image), and statistically similar results to the
behaviorally-validated model.
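The quoted speedup can be checked directly from the two running times in the abstract. The short sketch below uses exact rational arithmetic; the variable names are illustrative and only the 4.2 hours and 0.7 seconds come from the text.

```python
from fractions import Fraction

# Running times as stated in the abstract, converted to seconds.
baseline_seconds = Fraction(42, 10) * 3600   # 4.2 hours
fast_seconds = Fraction(7, 10)               # 0.7 seconds

# Ratio of the two running times.
speedup = baseline_seconds / fast_seconds
print(f"speedup factor: {speedup}")
```

The exact ratio of the two quoted times is 21,600, so the "21,000-fold" figure reads as a rounded value.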
Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification
Yu-Gang Jiang , Zuxuan Wu , Jinhui Tang , Zechao Li , Xiangyang Xue , Shih-Fu Chang Subjects : Multimedia (cs.MM) ; Computer Vision and Pattern Recognition (cs.CV)
Videos are inherently multimodal. This paper studies the problem of how to
fully exploit the abundant multimodal clues for improved video categorization.
We introduce a hybrid deep learning framework that integrates useful clues from
multiple modalities, including static spatial appearance information, motion
patterns within a short time window, audio information as well as long-range
temporal dynamics. More specifically, we utilize three Convolutional Neural
Networks (CNNs) operating on appearance, motion and audio signals to extract
their corresponding features. We then employ a feature fusion network to derive
a unified representation with an aim to capture the relationships among
features. Furthermore, to exploit the long-range temporal dynamics in videos,
we apply two Long Short Term Memory networks with extracted appearance and
motion features as inputs. Finally, we also propose to refine the prediction
scores by leveraging contextual relationships among video semantics. The hybrid
deep learning framework is able to exploit a comprehensive set of multimodal
features for video classification. Through an extensive set of experiments, we
demonstrate that (1) LSTM networks which model sequences in an explicitly
recurrent manner are highly complementary with CNN models; (2) the feature
fusion network which produces a fused representation through modeling feature
relationships outperforms alternative fusion strategies; (3) the semantic
context of video classes can help further refine the predictions for improved
performance. Experimental results on two challenging benchmarks, the UCF-101
and the Columbia Consumer Videos (CCV), provide strong quantitative evidence
that our framework achieves promising results: 93.1% on the UCF-101 and
84.5% on the CCV, outperforming competing methods with clear margins.
Enhanced discrete particle swarm optimization path planning for UAV vision-based surface inspection
Journal-ref: Automation in Construction, Vol.81, pp.25-33 (2017)
Subjects:
Robotics (cs.RO)
; Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
In built infrastructure monitoring, an efficient path planning algorithm is
essential for robotic inspection of large surfaces using computer vision. In
this work, we first formulate the inspection path planning problem as an
extended travelling salesman problem (TSP) in which both the coverage and
obstacle avoidance are taken into account. An enhanced discrete particle swarm
optimization (DPSO) algorithm is then proposed to solve the TSP, with
performance improvement by using deterministic initialization, random mutation,
and edge exchange. Finally, we take advantage of parallel computing to
implement the DPSO in a GPU-based framework so that the computation time can be
significantly reduced while keeping the hardware requirement unchanged. To show
the effectiveness of the proposed algorithm, experimental results are included
for datasets obtained from UAV inspection of an office building and a bridge.
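The abstract does not define its edge exchange operator. As a hedged illustration only, the sketch below assumes it resembles the classic 2-opt move for the TSP, which reverses a segment of the tour whenever that shortens it. The function names and the four-point example are hypothetical, and the code covers neither the swarm update nor the GPU implementation.

```python
import math

def tour_length(tour, points):
    """Length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def two_opt(tour, points):
    """Apply 2-opt moves (segment reversals) until none shortens the tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reversing best[i..j] swaps two edges of the tour for two others.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(candidate, points) < tour_length(best, points) - 1e-12:
                    best = candidate
                    improved = True
    return best

# Hypothetical example: the corners of a unit square visited in a crossing order.
points = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
initial = [0, 1, 2, 3]
improved_tour = two_opt(initial, points)
print(tour_length(initial, points), tour_length(improved_tour, points))
```

In a particle swarm setting, a local move of this kind would typically be applied to candidate tours between swarm updates; how the paper combines it with deterministic initialization and random mutation is not stated in the abstract.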
Identifying Spatial Relations in Images using Convolutional Neural Networks
Mandar Haldekar , Ashwinkumar Ganesan , Tim Oates Subjects : Artificial Intelligence (cs.AI) ; Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
Traditional approaches to building a large scale knowledge graph have usually
relied on extracting information (entities, their properties, and relations
between them) from unstructured text (e.g. DBpedia). Recent advances in
Convolutional Neural Networks (CNN) allow us to shift our focus to learning
entities and relations from images, as they build robust models that require
little or no pre-processing of the images. In this paper, we present an
approach to identify and extract spatial relations (e.g., The girl is standing
behind the table) from images using CNNs. Our research addresses two specific
challenges: providing insight into how spatial relations are learned by the
network and which parts of the image are used to predict these relations. We
use the pre-trained network VGGNet to extract features from an image and train
a Multi-layer Perceptron (MLP) on a set of synthetic images and the sun09
dataset to extract spatial relations. The MLP predicts spatial relations
without a bounding box around the objects or the space in the image depicting
the relation. To understand how the spatial relations are represented in the
network, a heatmap is overlaid on the image to show the regions that are
deemed important by the network. Also, we analyze the MLP to show the
relationship between the activation of consistent groups of nodes and the
prediction of a spatial relation. We show how the loss of these groups affects
the network’s ability to identify relations.
Artificial Intelligence
The Opacity of Backbones and Backdoors Under a Weak Assumption
Lane A. Hemaspaandra , David E. Narváez Subjects : Artificial Intelligence (cs.AI) ; Computational Complexity (cs.CC); Logic in Computer Science (cs.LO)
Backdoors and backbones of Boolean formulas are hidden structural properties
that are relevant to the analysis of the hardness of instances of the SAT
problem. The development and analysis of algorithms to find and make use of
these properties is thus useful to improve the performance of modern solvers
and our general understanding of SAT. In this work we show that, under the
assumption that P \(\neq\) NP, there are easily-recognizable sets of Boolean
formulas for which it is hard to determine whether they have a backbone. We
also show that, under the same assumption, there are easily-recognizable
families of Boolean formulas with strong backdoors that are easy to find, for
which it is hard to determine whether they are satisfiable or not.
Simultaneous merging multiple grid maps using the robust motion averaging
Zutao Jiang , Jihua Zhu , Yaochen Li , Zhongyu Li , Huimin Lu Subjects : Artificial Intelligence (cs.AI) ; Robotics (cs.RO)
Mapping in GPS-denied environments is an important and challenging task in
the field of robotics. In large environments, mapping can be significantly
accelerated by multiple robots exploring different parts of the environment.
Accordingly, a key problem is how to integrate the local maps built by
different robots into a single global map. In this paper, we propose an
approach for simultaneous merging of multiple grid maps using robust motion
averaging. The main idea of this approach is to recover all global motions for
map merging from a set of relative motions. To this end, it first adopts the
pair-wise map merging method to estimate relative motions for grid map pairs.
To obtain as many reliable relative motions as possible, a graph-based sampling
scheme is utilized to efficiently remove unreliable relative motions obtained
from the pair-wise map merging. Subsequently, accurate global motions can
be recovered from the set of reliable relative motions by motion averaging.
Experimental results on real robot data sets demonstrate that the proposed
approach can achieve simultaneous merging of multiple grid maps with good
performance.
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics
Ken Kansky , Tom Silver , David A. Mély , Mohamed Eldawy , Miguel Lázaro-Gredilla , Xinghua Lou , Nimrod Dorfman , Szymon Sidor , Scott Phoenix , Dileep George Subjects : Artificial Intelligence (cs.AI)
The recent adaptation of deep neural network-based methods to reinforcement
learning and planning domains has yielded remarkable progress on individual
tasks. Nonetheless, progress on task-to-task transfer remains limited. In
pursuit of efficient and robust generalization, we introduce the Schema
Network, an object-oriented generative physics simulator capable of
disentangling multiple causes of events and reasoning backward through causes
to achieve goals. The richly structured architecture of the Schema Network can
learn the dynamics of an environment directly from data. We compare Schema
Networks with Asynchronous Advantage Actor-Critic and Progressive Networks on a
suite of Breakout variations, reporting results on training efficiency and
zero-shot generalization, consistently demonstrating faster, more robust
learning and better transfer. We argue that generalizing from limited data and
learning causal relationships are essential abilities on the path toward
generally intelligent systems.
Identifying Spatial Relations in Images using Convolutional Neural Networks
Mandar Haldekar , Ashwinkumar Ganesan , Tim Oates Subjects : Artificial Intelligence (cs.AI) ; Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
Traditional approaches to building a large scale knowledge graph have usually
relied on extracting information (entities, their properties, and relations
between them) from unstructured text (e.g. DBpedia). Recent advances in
Convolutional Neural Networks (CNN) allow us to shift our focus to learning
entities and relations from images, as they build robust models that require
little or no pre-processing of the images. In this paper, we present an
approach to identify and extract spatial relations (e.g., The girl is standing
behind the table) from images using CNNs. Our research addresses two specific
challenges: providing insight into how spatial relations are learned by the
network and which parts of the image are used to predict these relations. We
use the pre-trained network VGGNet to extract features from an image and train
a Multi-layer Perceptron (MLP) on a set of synthetic images and the sun09
dataset to extract spatial relations. The MLP predicts spatial relations
without a bounding box around the objects or the space in the image depicting
the relation. To understand how the spatial relations are represented in the
network, a heatmap is overlaid on the image to show the regions that are
deemed important by the network. Also, we analyze the MLP to show the
relationship between the activation of consistent groups of nodes and the
prediction of a spatial relation. We show how the loss of these groups affects
the network’s ability to identify relations.
Neural Models for Key Phrase Detection and Question Generation
Sandeep Subramanian , Tong Wang , Xingdi Yuan , Adam Trischler Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We propose several neural models arranged in a two-stage framework to tackle
question generation from documents. First, we estimate the probability of
“interesting” answers in a document using a neural model trained on a
question-answering corpus. The predicted key phrases are then used as answers
to condition a sequence-to-sequence question generation model. Empirically, our
neural key phrase detection models significantly outperform an entity-tagging
baseline system. We demonstrate that the question generator formulates good
quality natural language questions from extracted key phrases. The resulting
questions and answers can be used to assess reading comprehension in
educational settings.
Learning and Evaluating Musical Features with Deep Autoencoders
Mason Bretan , Sageev Oore , Doug Eck , Larry Heck Subjects : Sound (cs.SD) ; Artificial Intelligence (cs.AI)
In this work we describe and evaluate methods to learn musical embeddings.
Each embedding is a vector that represents four contiguous beats of music and
is derived from a symbolic representation. We consider autoencoding-based
methods including denoising autoencoders, and context reconstruction, and
evaluate the resulting embeddings on a forward prediction and a classification
task.
Enhanced discrete particle swarm optimization path planning for UAV vision-based surface inspection
Journal-ref: Automation in Construction, Vol.81, pp.25-33 (2017)
Subjects:
Robotics (cs.RO)
; Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
In built infrastructure monitoring, an efficient path planning algorithm is
essential for robotic inspection of large surfaces using computer vision. In
this work, we first formulate the inspection path planning problem as an
extended travelling salesman problem (TSP) in which both the coverage and
obstacle avoidance are taken into account. An enhanced discrete particle swarm
optimization (DPSO) algorithm is then proposed to solve the TSP, with
performance improvement by using deterministic initialization, random mutation,
and edge exchange. Finally, we take advantage of parallel computing to
implement the DPSO in a GPU-based framework so that the computation time can be
significantly reduced while keeping the hardware requirement unchanged. To show
the effectiveness of the proposed algorithm, experimental results are included
for datasets obtained from UAV inspection of an office building and a bridge.
Optimization by a quantum reinforcement algorithm
Comments: 11 pages, 5 figures
Subjects:
Disordered Systems and Neural Networks (cond-mat.dis-nn)
; Statistical Mechanics (cond-mat.stat-mech); Artificial Intelligence (cs.AI); Learning (cs.LG); Quantum Physics (quant-ph)
A reinforcement algorithm solves a classical optimization problem by
introducing a feedback to the system which slowly changes the energy landscape
and converges the algorithm to an optimal solution in the configuration space.
Here, we use this strategy to concentrate (localize) preferentially the wave
function of a quantum particle, which explores the configuration space of the
problem, on an optimal configuration. We examine the method by solving
numerically the equations governing the evolution of the system, which are
similar to the nonlinear Schrödinger equations, for small problem sizes. In
particular, we observe that reinforcement increases the minimal energy gap of
the system in a quantum annealing algorithm. Our numerical simulations and the
latter observation show that this kind of quantum feedback might be helpful in
solving a computationally hard optimization problem by a quantum reinforcement
algorithm.
Comments: Thesis of PhD completed at Flinders University of South Australia, 2017
Subjects:
Robotics (cs.RO)
; Artificial Intelligence (cs.AI)
An Autonomous Underwater Vehicle (AUV) should carry out complex tasks in a
limited time interval. Since existing AUVs have limited battery capacity and
restricted endurance, they should autonomously manage mission time and the
resources to perform effective persistent deployment in longer missions. Task
assignment requires making decisions subject to resource constraints, while
tasks are assigned with costs and/or values that are budgeted in advance. Tasks
are distributed in a particular operation zone and mapped by a waypoint-covered
network. Thus, designing an efficient routing and task-priority assignment
framework that considers the vehicle’s availability and properties is essential
for increasing mission productivity and on-time mission completion. This depends
strongly on the order and priority of the tasks that are located between
node-like waypoints in an operation network. On the other hand, autonomous
operation of AUVs in an unfamiliar, dynamic underwater environment, with quick
responses to sudden environmental changes, is a complicated process. Water
current instabilities can deflect the vehicle in an undesired direction and
compromise its safety. The vehicle’s robustness to strong environmental
variations is crucial for its safe and optimal operation in an uncertain and
dynamic environment. To this end, the AUV needs a general overview of
the environment at the top level to perform autonomous action selection (task
selection) and a lower-level local motion planner to operate successfully in
continuously changing situations. This research deals with
developing a novel reactive control architecture to provide a higher level of
decision autonomy for the AUV operation that enables a single vehicle to
accomplish multiple tasks in a single mission in the face of periodic
disturbances in a turbulent and highly uncertain environment.
Information Retrieval
Hybrid Collaborative Recommendation via Semi-AutoEncoder
Comments: 10 pages
Subjects:
Information Retrieval (cs.IR)
In this paper, we first present a novel structure of AutoEncoder, namely
Semi-AutoEncoder. We generalize it into a designated hybrid collaborative
filtering model, which is able to predict ratings as well as to generate
personalized top-N recommendations. Experimental results on two real-world
datasets demonstrate its state-of-the-art performance.
Evaluating Personal Assistants on Mobile devices
Julia Kiseleva , Maarten de Rijke Subjects : Human-Computer Interaction (cs.HC) ; Information Retrieval (cs.IR)
The iPhone was introduced only a decade ago in 2007 but has fundamentally
changed the way we interact with online information. Mobile devices differ
radically from classic command-based and point-and-click user interfaces, now
allowing for gesture-based interaction using fine-grained touch and swipe
signals. Due to the rapid growth in the use of voice-controlled intelligent
personal assistants on mobile devices, such as Microsoft’s Cortana, Google Now,
and Apple’s Siri, mobile devices have become personal, allowing us to be online
all the time, and assist us in any task, both in work and in our daily lives,
making context a crucial factor to consider.
Mobile usage is now exceeding desktop usage, and is still growing at a rapid
rate, yet our main ways of training and evaluating personal assistants are
still based on (and framed in) classical desktop interactions, focusing on
explicit queries, clicks, and dwell time spent. However, modern user
interaction with mobile devices is radically different due to touch screens
with gesture- and voice-based control and the varying context of use, e.g.,
in a car, by bike, often invalidating the assumptions underlying today’s user
satisfaction evaluation.
There is an urgent need to understand voice- and gesture-based interaction,
taking all interaction signals and context into account in appropriate ways. We
propose a research agenda for developing methods to evaluate and improve
context-aware user satisfaction with mobile interactions using gesture-based
signals at scale.
Identifying Condition-Action Statements in Medical Guidelines Using Domain-Independent Features
Hossein Hematialam , Wlodek Zadrozny Subjects : Computation and Language (cs.CL) ; Information Retrieval (cs.IR)
This paper advances the state of the art in text understanding of medical
guidelines by releasing two new annotated clinical guidelines datasets, and
establishing baselines for using machine learning to extract condition-action
pairs. In contrast to prior work that relies on manually created rules, we
report experiments with several supervised machine learning techniques to
classify sentences as to whether they express conditions and actions. We show
the limitations and possible extensions of this work on text mining of medical
guidelines.
Computation and Language
Neural Models for Key Phrase Detection and Question Generation
Sandeep Subramanian , Tong Wang , Xingdi Yuan , Adam Trischler Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We propose several neural models arranged in a two-stage framework to tackle
question generation from documents. First, we estimate the probability of
“interesting” answers in a document using a neural model trained on a
question-answering corpus. The predicted key phrases are then used as answers
to condition a sequence-to-sequence question generation model. Empirically, our
neural key phrase detection models significantly outperform an entity-tagging
baseline system. We demonstrate that the question generator formulates good
quality natural language questions from extracted key phrases. The resulting
questions and answers can be used to assess reading comprehension in
educational settings.
Idea density for predicting Alzheimer's disease from transcribed speech
Comments: CoNLL 2017
Subjects:
Computation and Language (cs.CL)
Idea Density (ID) measures the rate at which ideas or elementary predications
are expressed in an utterance or in a text. Lower ID is found to be associated
with an increased risk of developing Alzheimer’s disease (AD) (Snowdon et al.,
1996; Engelman et al., 2010). ID has been used in two different versions:
propositional idea density (PID) counts the expressed ideas and can be applied
to any text while semantic idea density (SID) counts pre-defined information
content units and is naturally more applicable to normative domains, such as
picture description tasks. In this paper, we develop DEPID, a novel
dependency-based method for computing PID, and its variant DEPID-R, which can
exclude repeated ideas, a feature characteristic of AD speech. We conduct
the first comparison of automatically extracted PID and SID in the diagnostic
classification task on two different AD datasets covering both closed-topic and
free-recall domains. While SID performs better on the normative dataset, adding
PID leads to a small but significant improvement (+1.7 F-score). On the
free-topic dataset, PID performs better than SID as expected (77.6 vs 72.3 in
F-score) but adding the features derived from the word embedding clustering
underlying the automatic SID increases the results considerably, leading to an
F-score of 84.8.
Fine-grained human evaluation of neural versus phrase-based machine translation
Comments: 12 pages, 2 figures, The Prague Bulletin of Mathematical Linguistics
Journal-ref: The Prague Bulletin of Mathematical Linguistics No. 108, pp.
121-132 (2017)
Subjects:
Computation and Language (cs.CL)
We compare three approaches to statistical machine translation (pure
phrase-based, factored phrase-based and neural) by performing a fine-grained
manual evaluation via error annotation of the systems’ outputs. The error types
in our annotation are compliant with the multidimensional quality metrics
(MQM), and the annotation is performed by two annotators. Inter-annotator
agreement is high for such a task, and results show that the best performing
system (neural) reduces the errors produced by the worst system (phrase-based)
by 54%.
Transfer Learning for Neural Semantic Parsing
Comments: Accepted for ACL Repl4NLP 2017
Subjects:
Computation and Language (cs.CL)
; Learning (cs.LG)
The goal of semantic parsing is to map natural language to a machine
interpretable meaning representation language (MRL). One of the constraints
that limits full exploration of deep learning technologies for semantic parsing
is the lack of sufficient annotation training data. In this paper, we propose
using sequence-to-sequence in a multi-task setup for semantic parsing with a
focus on transfer learning. We explore three multi-task architectures for
sequence-to-sequence modeling and compare their performance with an
independently trained model. Our experiments show that the multi-task setup
aids transfer learning from an auxiliary task with large labeled data to a
target task with smaller labeled data. We see absolute accuracy gains ranging
from 1.0% to 4.4% on our in-house data set, and we also see good gains ranging
from 2.5% to 7.0% on the ATIS semantic parsing tasks with syntactic and
semantic auxiliary tasks.
Identifying Condition-Action Statements in Medical Guidelines Using Domain-Independent Features
Hossein Hematialam , Wlodek Zadrozny Subjects : Computation and Language (cs.CL) ; Information Retrieval (cs.IR)
This paper advances the state of the art in text understanding of medical
guidelines by releasing two new annotated clinical guidelines datasets, and
establishing baselines for using machine learning to extract condition-action
pairs. In contrast to prior work that relies on manually created rules, we
report experiments with several supervised machine learning techniques to
classify sentences as to whether they express conditions and actions. We show
the limitations and possible extensions of this work on text mining of medical
guidelines.
Is Natural Language Strongly Nonergodic? A Stronger Theorem about Facts and Words
Comments: 20 pages, 1 figure
Subjects:
Information Theory (cs.IT)
; Computation and Language (cs.CL)
As we discuss, a stationary stochastic process is nonergodic when a random
persistent topic can be detected in the infinite random text sampled from the
process, whereas we call the process strongly nonergodic when an infinite
sequence of random bits, called random facts, is needed to describe this topic
completely. Whereas natural language has been often supposed to be nonergodic,
we exhibit some indirect evidence that natural language may be also strongly
nonergodic. First, we present a surprising assertion, which we call the theorem
about facts and words. This proposition states that the number of random facts
which can be inferred from a finite text sampled from a stationary process must
be roughly smaller than the number of word-like strings detected in this text
by the PPM compression algorithm. Second, we observe that the number of the
word-like strings for some texts in natural language follows an empirical
stepwise power law. In view of both observations, the number of inferrable
facts for natural language may also follow a power law, i.e., natural language
may be strongly nonergodic.
Adversarially Regularized Autoencoders for Generating Discrete Structures
Junbo (Jake) Zhao , Yoon Kim , Kelly Zhang , Alexander M. Rush , Yann LeCun Subjects : Learning (cs.LG) ; Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Generative adversarial networks are an effective approach for learning rich
latent representations of continuous data, but have proven difficult to apply
directly to discrete structured data, such as text sequences or discretized
images. Ideally we could encode discrete structures in a continuous code space
to avoid this problem, but it is difficult to learn an appropriate
general-purpose encoder. In this work, we consider a simple approach for
handling these two challenges jointly, employing a discrete structure
autoencoder with a code space regularized by generative adversarial training.
The model learns a smooth regularized code space while still being able to
model the underlying data, and can be used as a discrete GAN with the ability
to generate coherent discrete outputs from continuous samples. We demonstrate
empirically how key properties of the data are captured in the model’s latent
space, and evaluate the model itself on the tasks of discrete image generation,
text generation, and semi-supervised learning.
Distributed, Parallel, and Cluster Computing
Block-space GPU Mapping for Embedded Sierpiński Gasket Fractals
Comments: 7 pages, 8 Figures
Subjects:
Distributed, Parallel, and Cluster Computing (cs.DC)
This work studies the problem of GPU thread mapping for a Sierpiński gasket
fractal embedded in a discrete Euclidean space of \(n \times n\). A block-space
map \(\lambda: \mathbb{Z}_{\mathbb{E}}^{2} \mapsto \mathbb{Z}_{\mathbb{F}}^{2}\)
is proposed, from Euclidean parallel space \(\mathbb{E}\) to embedded fractal
space \(\mathbb{F}\), that maps in \(\mathcal{O}(\log_2 \log_2(n))\) time and uses
no more than \(\mathcal{O}(n^{\mathbb{H}})\) threads with \(\mathbb{H} \approx
1.58\ldots\) being the Hausdorff dimension, making it parallel space efficient.
When compared to a bounding-box map, \(\lambda(\omega)\) offers a sub-exponential
improvement in parallel space and a monotonically increasing speedup once \(n >
n_0\). Experimental performance tests show that in practice \(\lambda(\omega)\)
can produce performance improvement at any block-size once \(n > n_0 = 2^8\),
reaching approximately \(10\times\) of speedup for \(n=2^{16}\) under optimal block
configurations.
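The gap between the bounding-box thread count and the \(n^{\mathbb{H}}\) bound can be seen on small grids. The sketch below is not the paper's block-space map \(\lambda\). It assumes one common embedding of the gasket, in which cell \((x, y)\) belongs to the fractal exactly when the bitwise AND of x and y is zero, and it counts how many cells a bounding-box launch would cover versus how many do useful work. All function names are illustrative.

```python
def in_gasket(x: int, y: int) -> bool:
    """Assumed embedding: (x, y) is a gasket cell when x and y share no set bit."""
    return (x & y) == 0

def gasket_cells(n: int) -> int:
    """Count gasket cells in the n x n grid by brute force."""
    return sum(1 for y in range(n) for x in range(n) if in_gasket(x, y))

def compare(k: int):
    """Return (bounding-box threads, useful threads) for n = 2**k."""
    n = 2 ** k
    return n * n, gasket_cells(n)

for k in range(1, 9):
    box, useful = compare(k)
    # For n = 2**k, n**log2(3) equals 3**k, matching H = log2(3), about 1.58.
    print(f"n=2^{k}: bounding box {box} threads, gasket {useful} cells, "
          f"useful fraction {useful / box:.3f}")
```

Under this embedding the useful fraction is \((3/4)^k\) for \(n = 2^k\), so a bounding-box launch wastes a growing share of its threads as \(n\) increases, which is the inefficiency a map onto fractal space is meant to avoid.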
Towards Adaptive Resilience in High Performance Computing
Comments: 2 pages, to be published in Proceedings of the Work in Progress Session held in connection with the 25th EUROMICRO International Conference on Parallel, Distributed and Network-based Processing, PDP 2017
Subjects:
Distributed, Parallel, and Cluster Computing (cs.DC)
; Performance (cs.PF)
Failure rates in high performance computers rapidly increase due to the
growth in system size and complexity. Hence, failures have become the norm rather
than the exception. Different approaches for high performance computing (HPC)
systems have been introduced to prevent failures (e.g., redundancy) or at
least minimize their impacts (e.g., checkpoint and restart). In most cases,
when these approaches are employed to increase the resilience of certain parts
of a system, energy consumption rapidly increases, or performance significantly
degrades. To address this challenge, we propose on-demand resilience as an
approach to achieve adaptive resilience in HPC systems. In this work, the HPC
system is considered in its entirety and resilience mechanisms such as
checkpointing, isolation, and migration, are activated on-demand. Using the
proposed approach, the unavoidable increase in total energy consumption and
system performance degradation is decreased compared to the typical
checkpoint/restart and redundant resilience mechanisms. Our work aims to
mitigate a large number of failures occurring at various layers in the system,
to prevent their propagation, and to minimize their impact, all of this in an
energy-saving manner. In the case of failures that are estimated to occur but
cannot be mitigated using the proposed on-demand resilience approach, the
system administrators will be notified in view of performing further
investigations into the causes of these failures and their impacts.
Anonymization of System Logs for Privacy and Storage Benefits
Comments: 8 pages, 5 figures, for demonstration see this https URL
Subjects:
Distributed, Parallel, and Cluster Computing (cs.DC)
; Cryptography and Security (cs.CR)
System logs constitute valuable information for analysis and diagnosis of
system behavior. The size of parallel computing systems and the number of their
components steadily increase. The volume of logs generated by the system grows in
proportion to this increase. Hence, long-term collection and storage of system
logs is challenging. The analysis of system logs requires advanced text
processing techniques. For very large volumes of logs, the analysis is highly
time-consuming and requires a high level of expertise. For many parallel
computing centers, outsourcing the analysis of system logs to third parties is
the only affordable option. The existence of sensitive data within system log
entries obstructs, however, the transmission of system logs to third parties.
Moreover, the analytical tools for processing system logs and the solutions
provided by such tools are highly system specific. Achieving a more general
solution is only possible through access to, and analysis of, the system logs of
multiple computing systems. The privacy concerns impede, however, the sharing
of system logs across institutions as well as in the public domain. This work
proposes a new method for the anonymization of the information within system
logs that employs de-identification and encoding to provide sharable system
logs, with the highest possible data quality and of reduced size. The results
presented in this work indicate that apart from eliminating the sensitive data
within system logs and converting them into shareable data, the proposed
anonymization method provides 25% performance improvement in post-processing of
the anonymized system logs, and more than 50% reduction in their required
storage space.
Runtime Verification for Business Processes Utilizing the Bitcoin Blockchain
Christoph Prybila , Stefan Schulte , Christoph Hochreiner , Ingo Weber Subjects : Software Engineering (cs.SE) ; Distributed, Parallel, and Cluster Computing (cs.DC)
The usage of process choreographies and decentralized Business Process
Management Systems has been named as an alternative to centralized business
process orchestration. In choreographies, control over a process instance is
shared between independent parties, and no party has full control or knowledge
during process runtime. Nevertheless, it is necessary to monitor and verify
process instances during runtime for purposes of documentation, accounting, or
compensation.
To achieve business process runtime verification, this work explores the
suitability of the Bitcoin blockchain to create a novel solution for
choreographies. The resulting approach is realized in a fully-functional
software prototype. This software solution is evaluated in a qualitative
comparison. Findings show that our blockchain-based approach enables a seamless
execution monitoring and verification of choreographies, while at the same time
preserving anonymity and independence of the process participants. Furthermore,
the prototype is evaluated in a performance analysis.
A Hybrid Observer for a Distributed Linear System with a Changing Neighbor Graph
Comments: 7 pages, the 56th IEEE Conference on Decision and Control
Subjects:
Systems and Control (cs.SY)
; Distributed, Parallel, and Cluster Computing (cs.DC)
A hybrid observer is described for estimating the state of an \(m>0\) channel,
\(n\)-dimensional, continuous-time, distributed linear system of the form
\(\dot{x} = Ax,\; y_i = C_ix,\; i\in\{1,2,\ldots, m\}\). The system’s state \(x\) is
simultaneously estimated by \(m\) agents assuming each agent \(i\) senses \(y_i\) and
receives appropriately defined data from each of its current neighbors.
Neighbor relations are characterized by a time-varying directed graph
\(\mathbb{N}(t)\) whose vertices correspond to agents and whose arcs depict
neighbor relations. Agent \(i\) updates its estimate \(x_i\) of \(x\) at “event
times” \(t_1,t_2,\ldots\) using a local observer and a local parameter
estimator. The local observer is a continuous time linear system whose input is
\(y_i\) and whose output \(w_i\) is an asymptotically correct estimate of \(L_ix\)
where \(L_i\) is a matrix with kernel equaling the unobservable space of \((C_i,A)\).
The local parameter estimator is a recursive algorithm designed to estimate,
prior to each event time \(t_j\), a constant parameter \(p_j\) which satisfies the
linear equations \(w_k(t_j-\tau) =
L_kp_j+\mu_k(t_j-\tau),\; k\in\{1,2,\ldots,m\}\), where \(\tau\) is a small
positive constant and \(\mu_k\) is the state estimation error of local observer
\(k\). Agent \(i\) accomplishes this by iterating its parameter estimator state
\(z_i\), \(q\) times within the interval \([t_j-\tau, t_j)\), and by making use of
the state of each of its neighbors’ parameter estimators at each iteration. The
updated value of \(x_i\) at event time \(t_j\) is then \(x_i(t_j) =
e^{A\tau}z_i(q)\). Subject to the assumptions that (i) the neighbor graph
\(\mathbb{N}(t)\) is strongly connected for all time, (ii) the system whose state
is to be estimated is jointly observable, (iii) \(q\) is sufficiently large, it
is shown that each estimate \(x_i\) converges to \(x\) exponentially fast as
\(t \rightarrow \infty\) at a rate which can be controlled.
Learning
Provable benefits of representation learning
Comments: 22 pages
Subjects:
Learning (cs.LG)
; Machine Learning (stat.ML)
There is general consensus that learning representations is useful for a
variety of reasons, e.g. efficient use of labeled data (semi-supervised
learning), transfer learning and understanding hidden structure of data.
Popular techniques for representation learning include clustering, manifold
learning, kernel-learning, autoencoders, Boltzmann machines, etc.
To study the relative merits of these techniques, it’s essential to formalize
the definition and goals of representation learning, so that they all
become instances of the same definition. This paper introduces such a formal
framework that also formalizes the utility of learning the representation. It
is related to previous Bayesian notions, but with some new twists. We show the
usefulness of our framework by exhibiting simple and natural settings — linear
mixture models and loglinear models, where the power of representation learning
can be formally shown. In these examples, representation learning can be
performed provably and efficiently under plausible assumptions (despite being
NP-hard), and furthermore: (i) it greatly reduces the need for labeled data
(semi-supervised learning) and (ii) it allows solving classification tasks when
simpler approaches like nearest neighbors require too much data (iii) it is
more powerful than manifold learning methods.
On Calibration of Modern Neural Networks
Comments: Accepted to ICML 2017
Subjects:
Learning (cs.LG)
Confidence calibration — the problem of predicting probability estimates
representative of the true correctness likelihood — is important for
classification models in many applications. We discover that modern neural
networks, unlike those from a decade ago, are poorly calibrated. Through
extensive experiments, we observe that depth, width, weight decay, and Batch
Normalization are important factors influencing calibration. We evaluate the
performance of various post-processing calibration methods on state-of-the-art
architectures with image and document classification datasets. Our analysis and
experiments not only offer insights into neural network learning, but also
provide a simple and straightforward recipe for practical settings: on most
datasets, temperature scaling — a single-parameter variant of Platt Scaling —
is surprisingly effective at calibrating predictions.
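As a rough illustration of the temperature scaling idea named in this abstract, the sketch below divides a classifier's logits by a single scalar T and picks T by minimising negative log-likelihood on held-out data. The grid search, the helper names (`fit_temperature`, `nll`), and the synthetic over-confident logits are assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def nll(logits, labels, temperature):
    # Mean negative log-likelihood of the true labels after dividing logits by T.
    probs = softmax(logits / temperature)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=None):
    # Hypothetical helper: choose T from a grid by minimising validation NLL.
    if grid is None:
        grid = np.linspace(0.05, 10.0, 400)
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic validation set with deliberately inflated (over-confident) logits.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=500)
logits = rng.normal(size=(500, 3))
logits[np.arange(500), labels] += 1.0  # weak signal toward the true class
logits *= 5.0                          # inflate confidence
T = fit_temperature(logits, labels)
calibrated = softmax(logits / T)
```

Because dividing by a positive scalar does not change the arg-max, the predicted class (and hence accuracy) is unchanged; only the confidence values move.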
SEARNN: Training RNNs with Global-Local Losses
Comments: 12 pages
Subjects:
Learning (cs.LG)
; Machine Learning (stat.ML)
We propose SEARNN, a novel training algorithm for recurrent neural networks
(RNNs) inspired by the “learning to search” (L2S) approach to structured
prediction. RNNs have been widely successful in structured prediction
applications such as machine translation or parsing, and are commonly trained
using maximum likelihood estimation (MLE). Unfortunately, this training loss is
not always an appropriate surrogate for the test error: by only maximizing the
ground truth probability, it fails to exploit the wealth of information offered
by structured losses. Further, it introduces discrepancies between training and
predicting (such as exposure bias) that may hurt test performance. Instead,
SEARNN leverages test-alike search space exploration to introduce global-local
losses that are closer to the test error. We demonstrate improved performance
over MLE on three different tasks: OCR, spelling correction and text chunking.
Finally, we propose a subsampling strategy to enable SEARNN to scale to large
vocabulary sizes.
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
Levent Sagun , Utku Evci , V. Ugur Guney , Yann Dauphin , Leon Bottou Subjects : Learning (cs.LG)
We study the properties of common loss surfaces through their Hessian matrix.
In particular, in the context of deep learning, we empirically show that the
spectrum of the Hessian is composed of two parts: (1) the bulk centered near
zero, and (2) outliers away from the bulk. We present numerical evidence and
mathematical justifications for the following conjectures laid out by Sagun et
al. (2016): Fixing data, increasing the number of parameters merely scales the
bulk of the spectrum; fixing the dimension and changing the data (for instance
adding more clusters or making the data less separable) only affects the
outliers. We believe that our observations have striking implications for
non-convex optimization in high dimensions. First, the flatness of such
landscapes (which can be measured by the singularity of the Hessian) implies
that classical notions of basins of attraction may be quite misleading, and
that the discussion of wide/narrow basins may be in need of a new perspective
around over-parametrization and redundancy, which are able to create large
connected components at the bottom of the landscape. Second, the dependence of
a small number of large eigenvalues on the data distribution can be linked to the
spectrum of the covariance matrix of gradients of model outputs. With this in
mind, we may reevaluate the connections within the data-architecture-algorithm
framework of a model, hoping that it would shed light into the geometry of
high-dimensional and non-convex spaces in modern applications. In particular,
we present a case that links the two observations: a gradient based method
appears to be first climbing uphill and then falling downhill between two
points; whereas, in fact, they lie in the same basin.
A survey of dimensionality reduction techniques based on random projection
Haozhe Xie , Jie Li , Hanqing Xue Subjects : Learning (cs.LG)
Dimensionality reduction techniques play important roles in the analysis of
big data. Traditional dimensionality reduction approaches, such as Principal
Component Analysis (PCA) and Linear Discriminant Analysis (LDA), have been
studied extensively in the past few decades. However, as the dimension of huge
data increases, the computational cost of traditional dimensionality reduction
approaches grows dramatically and becomes prohibitive. This has triggered
the development of the Random Projection (RP) technique, which maps high-dimensional
data onto a low-dimensional subspace within a short time. However, RP generates
the transformation matrix without considering the intrinsic structure of the original data
and usually leads to relatively high distortion. Therefore, in the past few
years, some approaches based on RP have been proposed to address this problem.
In this paper, we summarized these approaches in different applications to help
practitioners to employ proper approaches in their specific applications. Also,
we enumerated their benefits and limitations to provide further references for
researchers to develop novel RP-based approaches.
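For readers unfamiliar with the baseline technique, the following is a minimal sketch of a plain Gaussian random projection, the data-independent mapping the abstract refers to. The function name, the choice of Gaussian entries with variance 1/k, and the toy data are assumptions for illustration; the survey's RP-based variants are not reproduced here.

```python
import numpy as np

def gaussian_random_projection(X, k, seed=0):
    # Project rows of X (n x d) to k dimensions with a random Gaussian matrix.
    # Entries have variance 1/k so squared distances are preserved in expectation.
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return X @ R

def pairwise_distances(A):
    # Dense Euclidean distance matrix (fine for small n).
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 1000))          # 20 points in 1000 dimensions
Z = gaussian_random_projection(X, k=300)  # same points in 300 dimensions

# Ratio of projected to original distance for every pair of points.
iu = np.triu_indices(len(X), k=1)
ratios = pairwise_distances(Z)[iu] / pairwise_distances(X)[iu]
```

The matrix `R` is drawn without looking at `X`, which is exactly why the projection is cheap and also why its distortion cannot adapt to the structure of the data.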
Dueling Bandits With Weak Regret
Bangrui Chen , Peter I. Frazier Subjects : Learning (cs.LG)
We consider online content recommendation with implicit feedback through
pairwise comparisons, formalized as the so-called dueling bandit problem. We
study the dueling bandit problem in the Condorcet winner setting, and consider
two notions of regret: the more well-studied strong regret, which is 0 only
when both arms pulled are the Condorcet winner; and the less well-studied weak
regret, which is 0 if either arm pulled is the Condorcet winner. We propose a
new algorithm for this problem, Winner Stays (WS), with variations for each
kind of regret: WS for weak regret (WS-W) has expected cumulative weak regret
that is \(O(N^2)\), and \(O(N\log(N))\) if arms have a total order; WS for strong
regret (WS-S) has expected cumulative strong regret of \(O(N^2 + N \log(T))\),
and \(O(N\log(N)+N\log(T))\) if arms have a total order. WS-W is the first
dueling bandit algorithm with weak regret that is constant in time. WS is
simple to compute, even for problems with many arms, and we demonstrate through
numerical experiments on simulated and real data that WS has significantly
smaller regret than existing algorithms in both the weak- and strong-regret
settings.
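The two regret notions contrasted above can be made concrete with a small sketch. It uses a simplified 0/1 regret per round (an assumption made here; the abstract only states when each regret is zero) and a hypothetical preference matrix `P`, where `P[i][j]` is the probability that arm `i` beats arm `j`. The Winner Stays algorithm itself is not reproduced.

```python
def condorcet_winner(P):
    # Return the arm that beats every other arm with probability > 0.5, or None.
    n = len(P)
    for i in range(n):
        if all(P[i][j] > 0.5 for j in range(n) if j != i):
            return i
    return None

def strong_regret(arm_a, arm_b, winner):
    # Zero only when BOTH pulled arms are the Condorcet winner.
    return 0 if (arm_a == winner and arm_b == winner) else 1

def weak_regret(arm_a, arm_b, winner):
    # Zero when EITHER pulled arm is the Condorcet winner.
    return 0 if (arm_a == winner or arm_b == winner) else 1

# Hypothetical 3-arm instance in which arm 0 beats both other arms.
P = [
    [0.5, 0.7, 0.6],
    [0.3, 0.5, 0.4],
    [0.4, 0.6, 0.5],
]
w = condorcet_winner(P)
duels = [(0, 1), (0, 0), (1, 2)]
strong = [strong_regret(a, b, w) for a, b in duels]
weak = [weak_regret(a, b, w) for a, b in duels]
```

A duel such as `(0, 1)` incurs strong regret but no weak regret, which is the gap that lets weak regret stay bounded while the learner keeps comparing the winner against other arms.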
Transfer entropy-based feedback improves performance in artificial neural networks
Sebastian Herzog , Christian Tetzlaff , Florentin Wörgötter Subjects : Learning (cs.LG) ; Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE)
The structure of the majority of modern deep neural networks is characterized
by unidirectional feed-forward connectivity across a very large number of
layers. By contrast, the architecture of the cortex of vertebrates contains
fewer hierarchical levels but many recurrent and feedback connections. Here we
show that a small, few-layer artificial neural network that employs feedback
will reach top level performance on a standard benchmark task, otherwise only
obtained by large feed-forward structures. To achieve this we use feed-forward
transfer entropy between neurons to structure feedback connectivity. Transfer
entropy can here intuitively be understood as a measure for the relevance of
certain pathways in the network, which are then amplified by feedback. Feedback
may therefore be key for high network performance in small brain-like
architectures.
Adversarially Regularized Autoencoders for Generating Discrete Structures
Junbo (Jake) Zhao , Yoon Kim , Kelly Zhang , Alexander M. Rush , Yann LeCun Subjects : Learning (cs.LG) ; Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Generative adversarial networks are an effective approach for learning rich
latent representations of continuous data, but have proven difficult to apply
directly to discrete structured data, such as text sequences or discretized
images. Ideally we could encode discrete structures in a continuous code space
to avoid this problem, but it is difficult to learn an appropriate
general-purpose encoder. In this work, we consider a simple approach for
handling these two challenges jointly, employing a discrete structure
autoencoder with a code space regularized by generative adversarial training.
The model learns a smooth regularized code space while still being able to
model the underlying data, and can be used as a discrete GAN with the ability
to generate coherent discrete outputs from continuous samples. We demonstrate
empirically how key properties of the data are captured in the model’s latent
space, and evaluate the model itself on the tasks of discrete image generation,
text generation, and semi-supervised learning.
Hybrid Reward Architecture for Reinforcement Learning
Harm van Seijen , Mehdi Fatemi , Joshua Romoff , Romain Laroche , Tavian Barnes , Jeffrey Tsang Subjects : Learning (cs.LG)
One of the main challenges in reinforcement learning (RL) is generalisation.
In typical deep RL methods this is achieved by approximating the optimal value
function with a low-dimensional representation using a deep network. While this
approach works well in many domains, in domains where the optimal value
function cannot easily be reduced to a low-dimensional representation, learning
can be very slow and unstable. This paper contributes towards tackling such
challenging domains, by proposing a new method, called Hybrid Reward
Architecture (HRA). HRA takes as input a decomposed reward function and learns
a separate value function for each component reward function. Because each
component typically only depends on a subset of all features, the overall value
function is much smoother and can be more easily approximated by a low-dimensional
representation, enabling more effective learning. We demonstrate HRA on a
toy-problem and the Atari game Ms. Pac-Man, where HRA achieves above-human
performance.
Deep Learning Methods for Efficient Large Scale Video Labeling
Comments: 7 pages, 5 tables, 1 figure
Subjects:
Machine Learning (stat.ML)
; Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
We present a solution to “Google Cloud and YouTube-8M Video Understanding
Challenge” that ranked 5th place. The proposed model is an ensemble of three
model families, two frame level and one video level. The training was performed
on augmented dataset, with cross validation.
Comments: 12 pages, submitted
Subjects:
Information Theory (cs.IT)
; Learning (cs.LG); Machine Learning (stat.ML)
We study the problem of throughput maximization by predicting spectrum
opportunities using reinforcement learning. Our kernel-based reinforcement
learning approach is coupled with a sparsification technique that efficiently
captures the environment states to control dimensionality and finds the best
possible channel access actions based on the current state. This approach
allows learning and planning over the intrinsic state-action space and extends
well to large state and action spaces. For stationary Markov environments, we
derive the optimal policy for channel access, its associated limiting
throughput, and propose a fast online algorithm for achieving the optimal
throughput. We then show that the maximum-likelihood channel prediction and
access algorithm is suboptimal in general, and derive conditions under which
the two algorithms are equivalent. For reactive Markov environments, we derive
kernel variants of Q-learning, R-learning and propose an accelerated R-learning
algorithm that achieves faster convergence. We finally test our algorithms
against a generic reactive network. Simulation results are shown to validate
the theory and show the performance gains over current state-of-the-art
techniques.
Transfer Learning for Neural Semantic Parsing
Comments: Accepted for ACL Repl4NLP 2017
Subjects:
Computation and Language (cs.CL)
; Learning (cs.LG)
The goal of semantic parsing is to map natural language to a machine
interpretable meaning representation language (MRL). One of the constraints
that limits full exploration of deep learning technologies for semantic parsing
is the lack of sufficient annotation training data. In this paper, we propose
using sequence-to-sequence in a multi-task setup for semantic parsing with a
focus on transfer learning. We explore three multi-task architectures for
sequence-to-sequence modeling and compare their performance with an
independently trained model. Our experiments show that the multi-task setup
aids transfer learning from an auxiliary task with large labeled data to a
target task with smaller labeled data. We see absolute accuracy gains ranging
from 1.0% to 4.4% in our in-house data set, and we also see good gains ranging
from 2.5% to 7.0% on the ATIS semantic parsing tasks with syntactic and
semantic auxiliary tasks.
Teaching Compositionality to CNNs
Comments: Preprint appearing in CVPR 2017
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
; Learning (cs.LG)
Convolutional neural networks (CNNs) have shown great success in computer
vision, approaching human-level performance when trained for specific tasks via
application-specific loss functions. In this paper, we propose a method for
augmenting and training CNNs so that their learned features are compositional.
It encourages networks to form representations that disentangle objects from
their surroundings and from each other, thereby promoting better
generalization. Our method is agnostic to the specific details of the
underlying CNN to which it is applied and can in principle be used with any
CNN. As we show in our experiments, the learned representations lead to feature
activations that are more localized and improve performance over
non-compositional baselines in object recognition tasks.
Leveraging Node Attributes for Incomplete Relational Data
Comments: Appearing in ICML 2017
Subjects:
Machine Learning (stat.ML)
; Learning (cs.LG); Social and Information Networks (cs.SI)
Relational data are usually highly incomplete in practice, which inspires us
to leverage side information to improve the performance of community detection
and link prediction. This paper presents a Bayesian probabilistic approach that
incorporates various kinds of node attributes encoded in binary form in
relational models with Poisson likelihood. Our method works flexibly with both
directed and undirected relational networks. The inference can be done by
efficient Gibbs sampling which leverages sparsity of both networks and node
attributes. Extensive experiments show that our models achieve the
state-of-the-art link prediction results, especially with highly incomplete
relational data.
Optimization by a quantum reinforcement algorithm
Comments: 11 pages, 5 figures
Subjects:
Disordered Systems and Neural Networks (cond-mat.dis-nn)
; Statistical Mechanics (cond-mat.stat-mech); Artificial Intelligence (cs.AI); Learning (cs.LG); Quantum Physics (quant-ph)
A reinforcement algorithm solves a classical optimization problem by
introducing a feedback to the system which slowly changes the energy landscape
and converges the algorithm to an optimal solution in the configuration space.
Here, we use this strategy to concentrate (localize) preferentially the wave
function of a quantum particle, which explores the configuration space of the
problem, on an optimal configuration. We examine the method by solving
numerically the equations governing the evolution of the system, which are
similar to the nonlinear Schrödinger equations, for small problem sizes. In
particular, we observe that reinforcement increases the minimal energy gap of
the system in a quantum annealing algorithm. Our numerical simulations and the
latter observation show that this kind of quantum feedback might be helpful in
solving a computationally hard optimization problem by a quantum reinforcement
algorithm.
On Optimistic versus Randomized Exploration in Reinforcement Learning
Comments: Extended abstract for RLDM 2017
Subjects:
Machine Learning (stat.ML)
; Learning (cs.LG)
We discuss the relative merits of optimistic and randomized approaches to
exploration in reinforcement learning. Optimistic approaches presented in the
literature apply an optimistic boost to the value estimate at each state-action
pair and select actions that are greedy with respect to the resulting
optimistic value function. Randomized approaches sample from among
statistically plausible value functions and select actions that are greedy with
respect to the random sample. Prior computational experience suggests that
randomized approaches can lead to far more statistically efficient learning. We
present two simple analytic examples that elucidate why this is the case. In
principle, there should be optimistic approaches that fare well relative to
randomized approaches, but that would require intractable computation.
Optimistic approaches that have been proposed in the literature sacrifice
statistical efficiency for the sake of computational efficiency. Randomized
approaches, on the other hand, may enable simultaneous statistical and
computational efficiency.
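The two action-selection rules described in this abstract can be sketched for a single state as follows. The Gaussian form of the uncertainty, the additive bonus `bonus_scale * std`, and the numbers are assumptions chosen for illustration; the abstract does not specify a particular bonus or sampling distribution.

```python
import numpy as np

def optimistic_action(mean, std, bonus_scale=1.0):
    # Optimistic rule: boost each value estimate by a bonus, then act greedily.
    return int(np.argmax(np.asarray(mean) + bonus_scale * np.asarray(std)))

def randomized_action(mean, std, rng):
    # Randomized rule: sample one plausible value per action, then act greedily
    # with respect to that random sample.
    sample = rng.normal(loc=mean, scale=std)
    return int(np.argmax(sample))

# Hypothetical value estimates and uncertainties for three actions.
mean = np.array([1.0, 0.8, 0.2])
std = np.array([0.1, 0.5, 0.1])

rng = np.random.default_rng(0)
a_opt = optimistic_action(mean, std)
picks = [randomized_action(mean, std, rng) for _ in range(1000)]
counts = np.bincount(picks, minlength=3)
```

The optimistic rule always returns the same action for fixed estimates, whereas the randomized rule spreads its choices across actions in proportion to how plausible it is that each one is best.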
Information Theory
Comments: 12 pages, submitted
Subjects:
Information Theory (cs.IT)
; Learning (cs.LG); Machine Learning (stat.ML)
We study the problem of throughput maximization by predicting spectrum
opportunities using reinforcement learning. Our kernel-based reinforcement
learning approach is coupled with a sparsification technique that efficiently
captures the environment states to control dimensionality and finds the best
possible channel access actions based on the current state. This approach
allows learning and planning over the intrinsic state-action space and extends
well to large state and action spaces. For stationary Markov environments, we
derive the optimal policy for channel access, its associated limiting
throughput, and propose a fast online algorithm for achieving the optimal
throughput. We then show that the maximum-likelihood channel prediction and
access algorithm is suboptimal in general, and derive conditions under which
the two algorithms are equivalent. For reactive Markov environments, we derive
kernel variants of Q-learning, R-learning and propose an accelerated R-learning
algorithm that achieves faster convergence. We finally test our algorithms
against a generic reactive network. Simulation results are shown to validate
the theory and show the performance gains over current state-of-the-art
techniques.
On Error Detection in Asymmetric Channels
Comments: 4 pages, 2 figures
Subjects:
Information Theory (cs.IT)
; Discrete Mathematics (cs.DM)
We study the error detection problem in \(q\)-ary asymmetric channels wherein
every input symbol \(x_i\) is mapped to an output symbol \(y_i\) satisfying
\(y_i \geq x_i\). A general setting is assumed where the noise vectors are
(potentially) restricted in: 1) the amplitude, \(y_i - x_i \leq a\), 2) the
Hamming weight, \(\sum_{i=1}^n 1_{\{y_i \neq x_i\}} \leq h\), and 3) the total
weight, \(\sum_{i=1}^n (y_i - x_i) \leq t\). Optimal codes detecting these
types of errors are described for certain sets of parameters \(a, h, t\), both
in the standard and in the cyclic (\(\operatorname{mod}\, q\)) version of the
problem. It is also demonstrated that these codes are optimal in the large
alphabet limit for every \(a, h, t\) and every block-length \(n\).
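The three noise restrictions listed above translate directly into a membership check. The sketch below covers only the standard (non-cyclic) channel and uses the usual notion that a code detects a class of errors when no codeword can be turned into a different codeword by an error in that class. The function names and the tiny example codes are assumptions for illustration, not constructions from the paper.

```python
from itertools import permutations

def is_admissible(x, y, a=None, h=None, t=None):
    # True if y can be produced from x by asymmetric noise (y_i >= x_i) that
    # respects the optional amplitude (a), Hamming-weight (h), and
    # total-weight (t) limits. A limit of None means "unrestricted".
    diffs = [yi - xi for xi, yi in zip(x, y)]
    if any(d < 0 for d in diffs):
        return False
    if a is not None and max(diffs, default=0) > a:
        return False
    if h is not None and sum(d != 0 for d in diffs) > h:
        return False
    if t is not None and sum(diffs) > t:
        return False
    return True

def detects_all(code, a=None, h=None, t=None):
    # A code (list of distinct tuples) detects the errors if no codeword maps
    # onto a different codeword under admissible noise.
    return not any(is_admissible(x, y, a, h, t) for x, y in permutations(code, 2))

repetition = [(0, 0), (1, 1)]
constant_sum = [(0, 2), (1, 1), (2, 0)]  # all codewords have the same symbol sum
```

In `constant_sum`, any upward change in one coordinate raises the symbol sum, so the received word can never be another codeword regardless of `a`, `h`, or `t`.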
Comments: 8 pages
Subjects:
Information Theory (cs.IT)
In this paper, we consider a point-to-point link between an energy harvesting
transmitter and receiver, where neither node has the information about the
battery state or energy availability at the other node. We consider a model
where data is successfully delivered only in slots where both nodes are active.
Energy loss occurs whenever one node turns on while the other node is in sleep
mode. In each slot, based on their own energy availability, the transmitter and
receiver need to independently decide whether or not to turn on, with the aim
of maximizing the long-term time-average throughput. We present an upper bound
on the throughput achievable by analyzing a genie-aided system that has
noncausal knowledge of the energy arrivals at both the nodes. Next, we propose
an online policy requiring an occasional one-bit feedback whose throughput is
within one bit of the upper bound, asymptotically in the battery size. In order
to further reduce the feedback required, we propose a time-dilated version of
the online policy. As the time dilation gets large, this policy does not
require any feedback and achieves the upper bound asymptotically in the battery
size. Inspired by this, we also propose a near-optimal fully uncoordinated
policy. We use Monte Carlo simulations to validate our theoretical results and
illustrate the performance of the proposed policies.
Is Natural Language Strongly Nonergodic? A Stronger Theorem about Facts and Words
Comments: 20 pages, 1 figure
Subjects:
Information Theory (cs.IT)
; Computation and Language (cs.CL)
As we discuss, a stationary stochastic process is nonergodic when a random
persistent topic can be detected in the infinite random text sampled from the
process, whereas we call the process strongly nonergodic when an infinite
sequence of random bits, called random facts, is needed to describe this topic
completely. Whereas natural language has been often supposed to be nonergodic,
we exhibit some indirect evidence that natural language may be also strongly
nonergodic. First, we present a surprising assertion, which we call the theorem
about facts and words. This proposition states that the number of random facts
which can be inferred from a finite text sampled from a stationary process must
be roughly smaller than the number of word-like strings detected in this text
by the PPM compression algorithm. Second, we observe that the number of the
word-like strings for some texts in natural language follows an empirical
stepwise power law. In view of both observations, the number of inferrable
facts for natural language may also follow a power law, i.e., natural language
may be strongly nonergodic.
Strong converse bounds for high-dimensional estimation
Ramji Venkataramanan , Oliver Johnson Subjects : Information Theory (cs.IT) ; Statistics Theory (math.ST); Machine Learning (stat.ML)
In statistical inference problems, we wish to obtain lower bounds on the
minimax risk, that is to bound the performance of any possible estimator. A
standard technique to obtain risk lower bounds involves the use of Fano’s
inequality. In an information-theoretic setting, it is known that Fano’s
inequality typically does not give a sharp converse result (error lower bound)
for channel coding problems. Moreover, recent work has shown that an argument
based on binary hypothesis testing gives tighter results. We adapt this
technique to the statistical setting, and argue that Fano’s inequality can
always be replaced by this approach to obtain tighter lower bounds that can be
easily computed and are asymptotically sharp. We illustrate our technique in
three applications: density estimation, active learning of a binary classifier,
and compressed sensing, obtaining tighter risk lower bounds in each case.
Sequential Channel Estimation in the Presence of Random Phase Noise in NB-IoT Systems
Comments: 5 pages, 4 figures, submitted to conference
Subjects:
Information Theory (cs.IT)
We consider channel estimation (CE) in narrowband Internet-of-Things (NB-IoT)
systems. Due to the fluctuations in phase within receiver and transmitter
oscillators, and also the residual frequency offset (FO) caused by
discontinuous receiving of repetition coded transmit data-blocks, random phase
noise is present in received signals. Although the coherence time of the fading
channel can be assumed fairly long due to the low mobility of NB-IoT
user-equipments (UEs), such phase noise has to be considered before combining
the channel estimates over repetition copies to improve their accuracy.
In this paper, we derive a sequential minimum-mean-square-error (MMSE) channel
estimator in the presence of random phase noise that refines the CE
sequentially with each received repetition copy, with low complexity and
small data storage. Further, we show through simulations that the proposed
sequential MMSE estimator improves the mean-square-error (MSE) of CE by 1 dB in
the low signal-to-noise ratio (SNR) regime, compared to a traditional
sequential MMSE estimator that does not thoroughly consider the impact of
random phase noises.
WLS-Based Self-Localization Using Perturbed Anchor Positions and RSSI Measurements
Vikram Kumar , Reza Arablouei , Brano Kusy , Raja Jurdak , Neil W. Bergmann Subjects : Information Theory (cs.IT)
We consider the problem of self-localization by a resource-constrained node
within a network given radio signal strength indicator (RSSI) measurements from
a set of anchor nodes where the RSSI measurements as well as the anchor
position information are subject to perturbation. In order to achieve a
computationally efficient estimate for the unknown position, we minimize a
weighted sum-square-distance-error cost function in an iterative fashion
utilizing the gradient-descent method. We calculate the weights in the cost
function by taking into account perturbations in both RSSI measurements and
anchor node position information while assuming normal distribution for the
perturbations in the anchor node position information and log-normal
distribution for the RSSI-induced distance estimates. The latter assumption is
due to considering the log-distance path-loss model with normally-distributed
perturbations for the RSSI measurements in the logarithmic scale. We also
derive the Cramer-Rao lower bound associated with the considered position
estimation problem. We evaluate the performance of the proposed algorithm
considering various arbitrary network topologies and compare it with an
existing algorithm that is based on a similar approach but only accounts for
perturbations in the RSSI measurements. The experimental results show that the
proposed algorithm yields significant improvement in localization performance
over the existing algorithm while maintaining its computational efficiency.
This makes the proposed algorithm suitable for real-world applications where
the information available about the positions of anchor nodes often suffers from
uncertainty due to observational noise or error and where the computational and
energy resources of mobile nodes are limited, prohibiting the use of more
sophisticated techniques such as those based on semidefinite or second-order
cone programming.
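
The core idea, minimizing a weighted sum of squared distance errors by gradient descent, can be sketched as follows. This is only an illustration under stated assumptions: the cost is taken to be J(p) = sum_i w_i (||p - a_i|| - d_i)^2 in two dimensions, and the weights are placeholder constants rather than the weights the paper derives from the perturbation statistics. The function name `wls_localize`, the step size, and the anchor layout are hypothetical.

```python
import math


def wls_localize(anchors, distances, weights, start, step=0.05, iters=500):
    """Gradient descent on J(p) = sum_i w_i * (||p - a_i|| - d_i)**2 (2-D)."""
    x, y = start
    for _ in range(iters):
        gx = gy = 0.0
        for (ax, ay), d, w in zip(anchors, distances, weights):
            r = math.hypot(x - ax, y - ay)
            if r < 1e-12:
                continue  # gradient is undefined exactly at an anchor; skip term
            scale = 2.0 * w * (r - d) / r
            gx += scale * (x - ax)
            gy += scale * (y - ay)
        x -= step * gx
        y -= step * gy
    return x, y


# Hypothetical setup: four anchors, noise-free distances to a known position.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_pos = (3.0, 4.0)
distances = [math.hypot(true_pos[0] - ax, true_pos[1] - ay) for ax, ay in anchors]
weights = [1.0, 0.8, 0.8, 0.6]  # placeholder weights, not the paper's formula

estimate = wls_localize(anchors, distances, weights, start=(5.0, 5.0))
```

With noise-free distances the estimate converges to the true position regardless of the weights. With perturbed measurements the weights would determine how strongly each anchor is trusted, and that weighting is where the paper's contribution lies.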
Compressed Secret Key Agreement: Maximizing Multivariate Mutual Information Per Bit
Chung Chan
Subjects: Information Theory (cs.IT)
The multiterminal secret key agreement problem by public discussion is
revisited with an additional source compression step. Prior to public
discussion, users independently compress their private sources to filter out
less correlated information that adds little to the maximum achievable secret
key rate, referred to as the compressed secrecy capacity. The characterization
of this secrecy capacity poses new challenges in information processing and
dimension reduction, and the idea has given rise to one of the best-performing
schemes for secret key agreement under public discussion rate constraints.
Exploiting this connection, we derive single-letter lower and upper bounds for
a general source model, and a precise single-letter characterization for the
pairwise independent network model.
Quantifying genuine multipartite correlations and their pattern complexity
Comments: 4 pages
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Information Theory (cs.IT); Adaptation and Self-Organizing Systems (nlin.AO); Data Analysis, Statistics and Probability (physics.data-an)
We propose an information-theoretic framework to quantify multipartite
correlations in classical and quantum systems, answering questions such as:
what is the amount of seven-partite correlations in a given state of ten
particles? We identify measures of genuine multipartite correlations, i.e.
statistical dependencies which cannot be ascribed to bipartite correlations,
satisfying a set of desirable properties. Inspired by ideas developed in
complexity science, we then introduce the concept of weaving to classify states
with an equal amount of total correlations, but displaying different patterns
of multipartite correlations. The weaving of a state is defined as the weighted
sum of correlations of any order. Weaving measures are good descriptors of the
complexity of correlation structures in multipartite systems.
Transfer entropy-based feedback improves performance in artificial neural networks
Sebastian Herzog, Christian Tetzlaff, Florentin Wörgötter
Subjects: Learning (cs.LG); Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE)
The structure of most modern deep neural networks is characterized
by unidirectional feed-forward connectivity across a very large number of
layers. By contrast, the architecture of the vertebrate cortex contains
fewer hierarchical levels but many recurrent and feedback connections. Here we
show that a small, few-layer artificial neural network that employs feedback
reaches top-level performance on a standard benchmark task, otherwise only
obtained by large feed-forward structures. To achieve this, we use feed-forward
transfer entropy between neurons to structure feedback connectivity. Transfer
entropy can here be understood intuitively as a measure of the relevance of
certain pathways in the network, which are then amplified by feedback. Feedback
may therefore be key for high network performance in small brain-like
architectures.
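
For readers unfamiliar with transfer entropy, the quantity itself can be sketched with a simple plug-in estimator on discrete time series. This is only an illustration of the measure, not the authors' procedure for structuring feedback: it assumes discrete-valued signals, a history length of one step, and estimation by counting. The function name `transfer_entropy` and the toy signals are hypothetical.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(y_next, y_now, x_now)
             * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    triples = Counter()    # counts of (y_next, y_now, x_now)
    pairs_yx = Counter()   # counts of (y_now, x_now)
    pairs_yy = Counter()   # counts of (y_next, y_now)
    singles = Counter()    # counts of y_now
    n = len(target) - 1
    for t in range(n):
        y_next, y_now, x_now = target[t + 1], target[t], source[t]
        triples[(y_next, y_now, x_now)] += 1
        pairs_yx[(y_now, x_now)] += 1
        pairs_yy[(y_next, y_now)] += 1
        singles[y_now] += 1
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y_now, x_now)]
        p_given_self = pairs_yy[(y_next, y_now)] / singles[y_now]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


# Toy signals: y copies x with a one-step delay, so x drives y but not vice versa.
rng = random.Random(0)
x = [int(rng.random() < 0.5) for _ in range(2000)]
y = [0] + x[:-1]

te_x_to_y = transfer_entropy(x, y)  # large: x's past predicts y's next value
te_y_to_x = transfer_entropy(y, x)  # near zero: y's past adds nothing about x
```

The asymmetry between the two directions is what makes the measure usable as a directed relevance score between units.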