
MATLAB 2014 Projects


Phase-Based Binarization of Ancient Document Images: Model and Applications

ABSTRACT:

In this paper, a phase-based binarization model for ancient document images is proposed, as well as a postprocessing method that can improve any binarization method and a ground truth generation tool. Three feature maps derived from the phase information of an input document image constitute the core of this binarization model. These features are the maximum moment of phase congruency covariance, a locally weighted mean phase angle, and a phase preserved denoised image. The proposed model consists of three standard steps: 1) preprocessing; 2) main binarization; and 3) postprocessing. In the preprocessing and main binarization steps, the features used are mainly phase derived, while in the postprocessing step, specialized adaptive Gaussian and median filters are considered. One of the outputs of the binarization step, which shows high recall performance, is used in a proposed postprocessing method to improve the performance of other binarization methodologies. Finally, we develop a ground truth generation tool, called PhaseGT, to simplify and speed up the ground truth generation process for ancient document images. The comprehensive experimental results on the DIBCO’09, H-DIBCO’10, DIBCO’11, H-DIBCO’12, DIBCO’13, PHIBD’12, and BICKLEY DIARY data sets show the robustness of the proposed binarization method on various types of degradation and document images.
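
A minimal illustration of the pipeline's final stage, assuming an RGB scan with the hypothetical name ancient_page.png: Otsu thresholding stands in for the phase-derived main binarization (which would require log-Gabor phase congruency features), followed by a median-filter cleanup of the kind the abstract's postprocessing step describes.

% Baseline sketch, not the authors' phase-based model: Otsu binarization
% plus a median-filter cleanup pass as used in postprocessing.
img = im2double(imread('ancient_page.png'));   % hypothetical input scan
if ndims(img) == 3, img = rgb2gray(img); end   % assume RGB or grayscale
bw = im2bw(img, graythresh(img));              % global Otsu threshold
bw = medfilt2(bw, [3 3]);                      % suppress speckle noise
imwrite(bw, 'binarized.png');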

DOWNLOAD


Characterness: An Indicator of Text in the Wild

ABSTRACT:

Text in an image provides vital information for interpreting its contents, and text in a scene can aid a variety of tasks, from navigation to obstacle avoidance and odometry. Despite its value, however, detecting general text in images remains a challenging research problem. Motivated by the need to consider the widely varying forms of natural text, we propose a bottom-up approach to the problem, which reflects the characterness of an image region. In this sense, our approach mirrors the move from saliency detection methods to measures of objectness. In order to measure the characterness, we develop three novel cues that are tailored for character detection and a Bayesian method for their integration. Because text is made up of sets of characters, we then design a Markov random field model so as to exploit the inherent dependencies between characters. We experimentally demonstrate the effectiveness of our characterness cues as well as the advantage of Bayesian multicue integration. The proposed text detector outperforms state-of-the-art methods on a few benchmark scene text detection data sets. We also show that our measurement of characterness is superior to state-of-the-art saliency detection models when applied to the same task.
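
The Bayesian integration step can be sketched as naive-Bayes fusion of the three cue scores; the cue values, the prior, and the assumption that each score behaves like a likelihood ratio are all illustrative, not the paper's exact formulation.

% Naive-Bayes fusion of per-region characterness cues (a sketch under an
% independence assumption; the numbers below are placeholders).
cues  = [0.8 0.6 0.7];                 % hypothetical scores of the three cues
prior = 0.1;                           % assumed prior P(region is a character)
odds  = prior/(1 - prior) * prod(cues ./ (1 - cues));
p     = odds / (1 + odds);             % fused posterior probability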

DOWNLOAD


A Unified Data Embedding and Scrambling Method

ABSTRACT:

Conventionally, data embedding techniques aim at maintaining high output image quality so that the difference between the original and the embedded images is imperceptible to the naked eye. Recently, as a new trend, some researchers have exploited reversible data embedding techniques to deliberately degrade image quality to a desirable level of distortion. In this paper, a unified data embedding-scrambling technique called UES is proposed to achieve two objectives simultaneously, namely, high payload and adaptive scalable quality degradation. First, a pixel intensity value prediction method called checkerboard-based prediction is proposed to accurately predict 75% of the pixels in the image based on the information obtained from 25% of the image. Then, the locations of the predicted pixels are vacated to embed information while degrading the image quality. Given a desirable quality (quantified in SSIM) for the output image, UES guides the embedding-scrambling algorithm to handle the exact number of pixels, i.e., the perceptual quality of the embedded-scrambled image can be controlled. In addition, the prediction errors are stored at a predetermined precision using the structure side information to perfectly reconstruct or approximate the original image. In particular, given a desirable SSIM value, the precision of the stored prediction errors can be adjusted to control the perceptual quality of the reconstructed image. Experimental results confirm that UES is able to perfectly reconstruct or approximate the original image with an SSIM value >0.99 after completely degrading its perceptual quality while embedding at 7.001 bpp on average.
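
A single-pass flavour of checkerboard-based prediction, for intuition: pixels of one parity are predicted from the 4-neighbour mean of the other parity (the paper's two-pass scheme is what reaches the 75%-from-25% ratio). cameraman.tif ships with the Image Processing Toolbox.

% Predict the 'black' checkerboard sites from their 4 'white' neighbours.
I = double(imread('cameraman.tif'));
[h, w] = size(I);
[c, r] = meshgrid(1:w, 1:h);
black  = mod(r + c, 2) == 1;                  % checkerboard parity mask
k      = [0 1 0; 1 0 1; 0 1 0] / 4;           % 4-neighbour averaging kernel
pred   = imfilter(I, k, 'replicate');
err    = zeros(h, w);
err(black) = I(black) - pred(black);          % prediction errors to store/embed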

DOWNLOAD


Scene Text Recognition in Mobile Applications by Character Descriptor and Structure Configuration

ABSTRACT:

Text characters and strings in natural scenes can provide valuable information for many applications. Extracting text directly from natural scene images or videos is a challenging task because of diverse text patterns and variant background interferences. This paper proposes a method of scene text recognition from detected text regions. In text detection, our previously proposed algorithms are applied to obtain text regions from scene images. First, we design a discriminative character descriptor by combining several state-of-the-art feature detectors and descriptors. Second, we model character structure for each character class by designing stroke configuration maps. Our algorithm design is compatible with the application of scene text extraction in smart mobile devices. An Android-based demo system is developed to show the effectiveness of our proposed method on scene text information extraction from nearby objects. The demo system also provides us with some insight into algorithm design and performance improvement of scene text extraction. The evaluation results on benchmark data sets demonstrate that our proposed scheme of text recognition is comparable with the best existing methods.
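
As a toy stand-in for the character descriptor, one can pair HOG features with a nearest-neighbour classifier; the paper combines several detectors and descriptors, and trainPatches, trainLabels, and queryPatch are assumed variables (Computer Vision and Statistics Toolboxes required).

% HOG + nearest-neighbour character classification sketch.
trainFeat = [];
for i = 1:numel(trainPatches)                  % 32x32 grayscale patches
    trainFeat(i, :) = extractHOGFeatures(trainPatches{i}); %#ok<SAGROW>
end
queryFeat = extractHOGFeatures(queryPatch);
idx   = knnsearch(trainFeat, queryFeat);       % closest training character
label = trainLabels(idx);                      % predicted character class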

DOWNLOAD


Progressive Image Denoising Through Hybrid Graph Laplacian Regularization: A Unified Framework

ABSTRACT:

Recovering images from corrupted observations is necessary for many real-world applications. In this paper, we propose a unified framework to perform progressive image recovery based on hybrid graph Laplacian regularized regression. We first construct a multiscale representation of the target image by a Laplacian pyramid, then progressively recover the degraded image in the scale space from coarse to fine so that the sharp edges and texture can eventually be recovered. On one hand, within each scale, a graph Laplacian regularization model represented by an implicit kernel is learned, which simultaneously minimizes the least squares error on the measured samples and preserves the geometrical structure of the image data space. In this procedure, the intrinsic manifold structure is explicitly considered using both measured and unmeasured samples, and the nonlocal self-similarity property is utilized as a fruitful resource for abstracting a priori knowledge of the images. On the other hand, between two successive scales, the proposed model is extended to a projected high-dimensional feature space through explicit kernel mapping to describe the interscale correlation, in which the local structure regularity is learned and propagated from coarser to finer scales. In this way, the proposed algorithm gradually recovers more and more image details and edges, which could not be recovered at previous scales. We test our algorithm on one typical image recovery task: impulse noise removal. Experimental results on benchmark test images demonstrate that the proposed method achieves better performance than state-of-the-art algorithms.
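
Within one scale, the regularized regression admits a closed form. Below is a sketch on a single vectorized patch, with patch, mask, and the bandwidth sigma as assumed inputs; the paper's implicit-kernel and nonlocal machinery is reduced to a plain similarity graph (pdist2 needs the Statistics Toolbox).

% Solve min_f ||M(f - y)||^2 + lambda * f'*L*f in closed form.
y      = patch(:);                             % noisy patch as a vector
sigma  = 0.1;  lambda = 0.5;                   % assumed parameters
W      = exp(-pdist2(y, y).^2 / (2*sigma^2));  % pixel-similarity graph
L      = diag(sum(W, 2)) - W;                  % combinatorial Laplacian
M      = diag(double(mask(:)));                % 1 = measured, 0 = corrupted
f      = (M + lambda*L) \ (M*y);               % regularized reconstruction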

DOWNLOAD


Mining Weakly Labeled Web Facial Images for Search-Based Face Annotation

ABSTRACT:

This paper investigates a framework of search-based face annotation (SBFA) by mining weakly labeled facial images that are freely available on the World Wide Web (WWW). One challenging problem for the search-based face annotation scheme is how to effectively perform annotation by exploiting the list of most similar facial images and their weak labels, which are often noisy and incomplete. To tackle this problem, we propose an effective unsupervised label refinement (ULR) approach for refining the labels of web facial images using machine learning techniques. We formulate the learning problem as a convex optimization and develop effective optimization algorithms to solve the large-scale learning task efficiently. To further speed up the proposed scheme, we also propose a clustering-based approximation algorithm, which improves the scalability considerably. We have conducted an extensive set of empirical studies on a large-scale web facial image testbed, in which encouraging results show that the proposed ULR algorithms can significantly boost the performance of the promising SBFA scheme.
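
A majority-voting baseline (not the ULR refinement itself) makes the retrieve-then-annotate idea concrete; feats, labels, and q are assumed variables, and knnsearch needs the Statistics Toolbox.

% Retrieve top-K similar faces and vote their weak labels (baseline only).
K    = 10;
idx  = knnsearch(feats, q, 'K', K);     % feats: N x d gallery, q: 1 x d query
name = mode(labels(idx));               % labels: N x 1 numeric identities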

DOWNLOAD


Low-Rank Neighbor Embedding for Single Image Super-Resolution

ABSTRACT:

This letter proposes a novel single image super-resolution (SR) method based on low-rank matrix recovery (LRMR) and neighbor embedding (NE). LRMR is used to explore the underlying structures of subspaces spanned by similar patches. Specifically, the training patches are first divided into groups. Then the LRMR technique is utilized to learn the latent structure of each group. The NE algorithm is performed on the learnt low-rank components of the high-resolution (HR) and low-resolution (LR) patches to produce SR results. Experimental results suggest that our approach can reconstruct high-quality images, both quantitatively and perceptually.
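
The classic neighbor-embedding step (without the paper's low-rank preprocessing of the patch groups) looks roughly like this; lrDict, hrDict, and lrPatch are assumed row-per-patch variables.

% Reconstruct an HR patch from the K nearest LR training patches.
K   = 5;
idx = knnsearch(lrDict, lrPatch, 'K', K);      % nearest LR neighbours
G   = lrDict(idx, :)';                         % dl x K neighbour matrix
w   = (G'*G + 1e-6*eye(K)) \ (G'*lrPatch');    % ridge least-squares weights
w   = w / sum(w);                              % sum-to-one constraint
hrPatch = (hrDict(idx, :)' * w)';              % embedded HR estimate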

DOWNLOAD


Images as Occlusions of Textures: A Framework for Segmentation

ABSTRACT:

We propose a new mathematical and algorithmic framework for unsupervised image segmentation, which is a critical step in a wide variety of image processing applications. We have found that most existing segmentation methods are not successful on histopathology images, which prompted us to investigate segmentation of a broader class of images, namely those without clear edges between the regions to be segmented. We model these images as occlusions of random images, which we call textures, and show that local histograms are a useful tool for segmenting them. Based on our theoretical results, we describe a flexible segmentation framework that draws on existing work on nonnegative matrix factorization and image deconvolution. Results on synthetic texture mosaics and real histology images show the promise of the method.
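
A compressed sketch of the local-histogram idea, assuming an RGB image with the hypothetical name histology.png and two textures: every pixel's local histogram becomes a matrix row, NMF factors the matrix, and each pixel takes its dominant factor (nnmf needs the Statistics Toolbox; the paper's deconvolution step is omitted).

% Local histograms + NMF segmentation sketch.
I = imread('histology.png');                   % hypothetical input
g = im2uint8(rgb2gray(im2double(I)));
nbins = 16;  r = 7;                            % bins and window radius
H = zeros(numel(g), nbins);
for b = 1:nbins
    m = g >= (b-1)*16 & g < b*16;              % indicator of bin b
    H(:, b) = reshape(imfilter(double(m), ones(2*r+1)), [], 1);
end
[Wf, ~]  = nnmf(H, 2);                         % two occluding textures assumed
[~, seg] = max(Wf, [], 2);
seg = reshape(seg, size(g));                   % per-pixel segment labels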

DOWNLOAD


Hyperspectral Image Classification Through Bilayer Graph-Based Learning

ABSTRACT:

Hyperspectral image classification with a limited number of labeled pixels is a challenging task. In this paper, we propose a bilayer graph-based learning framework to address this problem. For graph-based classification, how to establish the neighboring relationship among the pixels from the high-dimensional features is the key to a successful classification. Our graph learning algorithm contains two layers. The first layer constructs a simple graph, where each vertex denotes one pixel and the edge weight encodes the similarity between two pixels. Unsupervised learning is then conducted to estimate the grouping relations among different pixels. These relations are subsequently fed into the second layer to form a hypergraph structure, on top of which semisupervised transductive learning is conducted to obtain the final classification results. Our experiments on three data sets demonstrate the merits of our proposed approach, which compares favorably with the state of the art.
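
The plain-graph version of the transductive step (the paper's second layer replaces the simple graph with a hypergraph) is standard label propagation; X (n x d pixel features), Y (n x c one-hot seed labels), and sigma are assumed.

% Label propagation on a simple pixel graph: F = (I - alpha*S)^-1 * Y.
sigma = 1.0;  alpha = 0.9;                     % assumed parameters
W = exp(-pdist2(X, X).^2 / (2*sigma^2));
W(1:size(W,1)+1:end) = 0;                      % no self-loops
d = sum(W, 2);
S = bsxfun(@times, bsxfun(@times, W, 1./sqrt(d)), 1./sqrt(d)');
F = (eye(size(W,1)) - alpha*S) \ Y;            % closed-form propagation
[~, cls] = max(F, [], 2);                      % predicted class per pixel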

DOWNLOAD


Fingerprint Compression Based on Sparse Representation

ABSTRACT:

A new fingerprint compression algorithm based on sparse representation is introduced. Obtaining an overcomplete dictionary from a set of fingerprint patches allows us to represent them as a sparse linear combination of dictionary atoms. In the algorithm, we first construct a dictionary for predefined fingerprint image patches. For a new given fingerprint image, its patches are represented according to the dictionary by computing an l0-minimization, and the representation is then quantized and encoded. In this paper, we consider the effect of various factors on compression results. Three groups of fingerprint images are tested. The experiments demonstrate that our algorithm is efficient compared with several competing compression techniques (JPEG, JPEG 2000, and WSQ), especially at high compression ratios. The experiments also illustrate that the proposed algorithm is robust with respect to minutiae extraction.
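
The l0-minimization is typically approximated greedily; a compact orthogonal matching pursuit, saved as omp.m, could look like the sketch below. A dictionary D with unit-norm columns, a vectorized patch x, and sparsity T are the caller's responsibility; quantization and entropy coding of the output are separate steps.

% Greedy sparse coding of one patch against dictionary D.
function a = omp(D, x, T)
    r = x;  idx = [];  a = zeros(size(D, 2), 1);
    for t = 1:T                                % T = target sparsity
        [~, j] = max(abs(D' * r));             % most correlated atom
        idx    = [idx j];                      %#ok<AGROW>
        a(idx) = D(:, idx) \ x;                % refit on selected atoms
        r      = x - D(:, idx) * a(idx);       % update residual
    end
end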

DOWNLOAD


Exposing Digital Image Forgeries by Illumination Color Classification

ABSTRACT:

For decades, photographs have been used to document space-time events and they have often served as evidence in courts. Although photographers are able to create composites of analog pictures, this process is very time consuming and requires expert knowledge. Today, however, powerful digital image editing software makes image modifications straightforward. This undermines our trust in photographs and, in particular, questions pictures as evidence for real-world events. In this paper, we analyze one of the most common forms of photographic manipulation, known as image composition or splicing. We propose a forgery detection method that exploits subtle inconsistencies in the color of the illumination of images. Our approach is machine-learning-based and requires minimal user interaction. The technique is applicable to images containing two or more people and requires no expert interaction for the tampering decision. To achieve this, we incorporate information from physics- and statistical-based illuminant estimators on image regions of similar material. From these illuminant estimates, we extract texture- and edge-based features which are then provided to a machine-learning approach for automatic decision-making. The classification performance using an SVM meta-fusion classifier is promising. It yields detection rates of 86% on a new benchmark dataset consisting of 200 images, and 83% on 50 images that were collected from the Internet.
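
The simplest member of the statistical estimator family the method draws on is the gray-world estimate; comparing per-face estimates gives a feel for the inconsistency cue (face1 and face2 are assumed double RGB crops; the paper's features and SVM fusion are far richer).

% Gray-world illuminant estimate per region and their angular error.
est = @(R) squeeze(mean(mean(R, 1), 2))';      % mean RGB of a region
e1 = est(face1);  e1 = e1 / norm(e1);
e2 = est(face2);  e2 = e2 / norm(e2);
angErr = acosd(min(1, e1 * e2'));              % large angle hints at splicing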

DOWNLOAD


Dual-Geometric Neighbor Embedding for Image Super Resolution With Sparse Tensor

ABSTRACT:

Neighbor embedding (NE) has proved its efficiency in single image super resolution (SISR). However, image patches do not strictly follow the same structure in the low-resolution and high-resolution spaces, which biases the image restoration. In this paper, considering that patches are a set of data with multiview characteristics and spatial organization, we propose a dual-geometric neighbor embedding (DGNE) approach for SISR. In DGNE, multiview features and local spatial neighbors of patches are explored to find a feature-spatial manifold embedding for images. We adopt a geometrically motivated assumption that for each patch there exists a small neighborhood in which only the patches that come from the same feature-spatial manifold will lie approximately in a low-dimensional affine subspace formulated by sparse neighbors. In order to find the sparse neighbors, a tensor-simultaneous orthogonal matching pursuit algorithm is developed to realize a joint sparse coding of feature-spatial image tensors. Experiments are performed on 3x magnification of natural images, and the recovered results prove the method's efficiency and superiority to its counterparts.
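
A loose, simplified flavour of the dual-geometric idea (not the paper's tensor-simultaneous OMP): rank candidate patches by a combined feature-space and spatial distance before embedding; feat, pos, qFeat, qPos, and beta are all assumed.

% Dual (feature + spatial) ranking of candidate neighbours.
beta  = 0.2;                                   % assumed spatial weight
dFeat = pdist2(feat, qFeat);                   % feature distances, N x 1
dPos  = pdist2(pos, qPos);                     % spatial distances, N x 1
[~, idx] = sort(dFeat + beta*dPos);            % joint geometric ranking
nbrs  = idx(1:5);                              % candidate sparse neighbours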

DOWNLOAD


Digital Image Sharing by Diverse Image Media

ABSTRACT:

Conventional visual secret sharing (VSS) schemes hide secret images in shares that are either printed on transparencies or are encoded and stored in a digital form. The shares can appear as noise-like pixels or as meaningful images, but either form will arouse suspicion and increase the interception risk during transmission of the shares. Hence, VSS schemes suffer from a transmission risk problem, both for the secret itself and for the participants who are involved in the scheme. To address this problem, we propose a natural-image-based VSS scheme (NVSS scheme) that shares secret images via various carrier media to protect the secret and the participants during the transmission phase. The proposed (n, n)-NVSS scheme can share one digital secret image over n - 1 arbitrarily selected natural images (called natural shares) and one noise-like share. The natural shares can be photos or hand-painted pictures in digital form or in printed form. The noise-like share is generated based on these natural shares and the secret image. The unaltered natural shares are diverse and innocuous, thus greatly reducing the transmission risk. We also propose possible ways to hide the noise-like share to reduce the transmission risk for that share. Experimental results indicate that the proposed approach is an excellent solution to the transmission risk problem of VSS schemes.
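
A toy XOR version conveys the flavour of deriving the one noise-like share from a natural carrier; the paper's feature extraction from printed and digital shares is considerably richer, and both file names are hypothetical.

% Derive a noise-like share from a natural image and the secret.
secret  = imread('secret.png');                % 8-bit grayscale secret
natural = imresize(rgb2gray(imread('photo.jpg')), size(secret));
noiseShare = bitxor(secret, natural);          % transmitted noise-like share
recovered  = bitxor(noiseShare, natural);      % recovery at the receiver
isequal(recovered, secret)                     % lossless round trip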

DOWNLOAD


Designing an Efficient Image Encryption-Then-Compression System via Prediction Error Clustering and Random Permutation

ABSTRACT:

In many practical scenarios, image encryption has to be conducted prior to image compression. This has led to the problem of how to design a pair of image encryption and compression algorithms such that compressing the encrypted images can still be efficiently performed. In this paper, we design a highly efficient image encryption-then-compression (ETC) system, where both lossless and lossy compression are considered. The proposed image encryption scheme, which operates in the prediction error domain, is shown to provide a reasonably high level of security. We also demonstrate that an arithmetic coding-based approach can be exploited to efficiently compress the encrypted images. More notably, the proposed compression approach applied to encrypted images is only slightly worse, in terms of compression efficiency, than state-of-the-art lossless/lossy image coders, which take original, unencrypted images as inputs. In contrast, most of the existing ETC solutions induce a significant penalty on compression efficiency.
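
Encrypting in the prediction error domain can be sketched with horizontal differencing plus a key-seeded permutation; the paper's clustering of errors is omitted, and the RNG seed stands in for the secret key.

% Encrypt prediction errors by random permutation, then invert.
I = double(imread('cameraman.tif'));
E = [I(:, 1), diff(I, 1, 2)];                  % errors w.r.t. left neighbour
rng(12345);                                    % secret key as RNG seed
perm = randperm(numel(E));
C = reshape(E(perm), size(E));                 % encrypted error image
invPerm(perm) = 1:numel(E);                    % inverse permutation
D = reshape(C(invPerm), size(E));
Irec = cumsum(D, 2);                           % undo differencing; Irec == I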

DOWNLOAD


BRINT: Binary Rotation Invariant and Noise Tolerant Texture Classification

ABSTRACT:

In this paper, we propose a simple, efficient, yet robust multiresolution approach to texture classification: binary rotation invariant and noise tolerant (BRINT). The proposed approach is very fast to build and very compact while remaining robust to illumination variations, rotation changes, and noise. We develop a novel and simple strategy to compute a local binary descriptor based on the conventional local binary pattern (LBP) approach, preserving the advantageous characteristics of uniform LBP. Points are sampled in a circular neighborhood while keeping the number of bins in a single-scale LBP histogram constant and small, such that arbitrarily large circular neighborhoods can be sampled and compactly encoded over a number of scales. There is no need to learn a texton dictionary, as in methods based on clustering, and no parameter tuning is required to deal with different data sets. Extensive experimental results on representative texture databases show that the proposed BRINT not only demonstrates superior performance to a number of recent state-of-the-art LBP variants under normal conditions, but also performs significantly and consistently better in the presence of noise, owing to its high distinctiveness and robustness. This noise robustness is evaluated quantitatively with different artificially generated types and levels of noise (including Gaussian, salt and pepper, and speckle noise) in natural texture images.
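
For orientation, a minimal single-scale 8-neighbour LBP histogram is sketched below; BRINT additionally averages the sampled points along each arc before thresholding and maps codes to rotation-invariant bins, both omitted here, and the input file name is hypothetical.

% Plain 8-neighbour LBP histogram of a texture image.
I = double(imread('texture.png'));             % hypothetical texture image
offs = [-1 -1; -1 0; -1 1; 0 1; 1 1; 1 0; 1 -1; 0 -1];
C = I(2:end-1, 2:end-1);                       % interior centre pixels
code = zeros(size(C));
for k = 1:8
    N = I(2+offs(k,1):end-1+offs(k,1), 2+offs(k,2):end-1+offs(k,2));
    code = code + (N >= C) * 2^(k-1);          % one bit per neighbour
end
h = histc(code(:), 0:255);                     % 256-bin LBP histogram
h = h / sum(h);                                % normalized descriptor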

DOWNLOAD


An Efficient Parallel Approach for Sclera Vein Recognition

ABSTRACT:

Sclera vein recognition is shown to be a promising method for human identification. However, its matching speed is slow, which limits its use in real-time applications. To improve the matching efficiency, we propose a new parallel sclera vein recognition method using a two-stage parallel approach for registration and matching. First, we design a rotation- and scale-invariant Y-shape-descriptor-based feature extraction method to efficiently eliminate most unlikely matches. Second, we develop a weighted polar line sclera descriptor structure that incorporates mask information to reduce GPU memory cost. Third, we design a coarse-to-fine two-stage matching method. Finally, we develop a mapping scheme to assign the subtasks to GPU processing units. The experimental results show that our proposed method achieves a dramatic processing speed improvement without compromising the recognition accuracy.
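
The brute-force matching stage maps naturally to the GPU; a sketch with gpuArray follows (Parallel Computing Toolbox assumed, probeDesc and galleryDesc are assumed descriptor matrices, and the paper's Y-shape descriptors and coarse-to-fine scheme are abstracted away).

% All-pairs squared descriptor distances on the GPU.
A  = gpuArray(single(probeDesc));              % m x d probe descriptors
B  = gpuArray(single(galleryDesc));            % n x d gallery descriptors
D2 = bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2*(A*B');
[best, idx] = min(gather(D2), [], 2);          % best gallery match per probe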

DOWNLOAD


A New Secure Image Transmission Technique via Secret-Fragment-Visible Mosaic Images by Nearly Reversible Color Transformations

ABSTRACT:

A new secure image transmission technique is proposed, which automatically transforms a given large-volume secret image into a so-called secret-fragment-visible mosaic image of the same size. The mosaic image, which looks similar to an arbitrarily selected target image and may be used as camouflage for the secret image, is yielded by dividing the secret image into fragments and transforming their color characteristics to match those of the corresponding blocks of the target image. Skillful techniques are designed to conduct the color transformation process so that the secret image may be recovered nearly losslessly. A scheme for handling overflows/underflows in the converted pixels' color values, by recording the color differences in the untransformed color space, is also proposed. The information required for recovering the secret image is embedded into the created mosaic image by a lossless data hiding scheme using a key. Good experimental results show the feasibility of the proposed method.
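
The block-wise color transformation can be sketched as mean/variance matching, invertible up to rounding from the stored parameters (S and T are assumed same-sized double blocks; overflow/underflow handling is omitted).

% Match a secret fragment's color statistics to its target block.
mu_s = mean(S(:));  sd_s = std(S(:));          % fragment statistics
mu_t = mean(T(:));  sd_t = std(T(:));          % target block statistics
F = (S - mu_s) * (sd_t/sd_s) + mu_t;           % transformed fragment
S_rec = (F - mu_t) * (sd_s/sd_t) + mu_s;       % near-lossless recovery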

DOWNLOAD

