Browsing by Author "Banzi, Jamal"
Now showing 1 - 5 of 5
Item Deep Predictive Neural Network: Unsupervised Learning for Hand Pose Estimation (2019-08-15) Banzi, Jamal; Bulugu, Isack; Ye, Zhongfu
Discriminative approaches to hand pose estimation from depth images usually require densely annotated data to train a supervised network. Generative methods, in turn, depend on temporal information to generate candidate poses and can become trapped in local minima during optimization. In contrast to these methods, we propose a hybrid two-stage deep predictive neural network that performs predictive coding of image sequences of hand poses in order to capture the latent features underlying a given image. First, we train a deep convolutional neural network (CNN) for direct regression of hand joint positions. Second, we add an unsupervised error term as part of the recurrent architecture connected to the predictive coding portion. The error regression term (ERT) keeps the residual errors of the estimated values minimal, while the predictive coding portion allows the network to be trained without supervision of the image sequences, so no dense annotation of data is required. We conduct a complete set of experiments on two challenging public datasets, ICVL and NYU. On the ICVL dataset, our approach improves accuracy over current state-of-the-art methods, with an average joint error of 7.5 mm. On the NYU dataset we achieve an average joint error of 12.2 mm, the smallest error among the state-of-the-art approaches.

Item Higher-Order Local Autocorrelation Feature Extraction Methodology for Hand Gestures Recognition (IEEE, 2017-12-25) Bulugu, Isack; Ye, Zhongfu; Banzi, Jamal
A novel feature extraction method for hand gesture recognition from sequences of image frames is described and tested. The proposed method employs higher-order local autocorrelation (HLAC) features for feature extraction.
The features are extracted from grey-scale images using different masks, characterising the texture of the hand image with respect to the possible positions and the product of the pixels marked in white. The features carrying the most useful information are then selected based on the mutual information quotient (MIQ). A multiple linear discriminant analysis (LDA) classifier is adopted to classify the different hand gestures. Experiments on the NUS dataset illustrate that HLAC is efficient for hand gesture recognition compared with other feature extraction methods.

Item Learning a deep predictive coding network for a semi-supervised 3D-hand pose estimation (IEEE, 2020-03-27) Bulugu, Isack; Banzi, Jamal; Huang, Shiliang; Ye, Zhongfu
In this paper we present a CNN-based approach for real-time 3D hand pose estimation from depth sequences. Prior discriminative approaches have achieved remarkable success but face two main challenges. First, the methods are fully supervised and hence require large amounts of annotated training data to extract the dynamic information from a hand representation. Second, they rely on hand detectors that are either based on strong assumptions or weak, and that often fail in situations such as complex environments and multiple hands. In contrast to these methods, this paper presents an approach that can be considered semi-supervised: it performs predictive coding of image sequences of hand poses in order to capture the latent features underlying a given image without supervision. The hand is modelled using a novel latent tree dependency model (LTDM), which transforms internal joint locations into an explicit representation. The modelled hand topology is then integrated with the pose estimator using a data-dependent method to jointly learn the latent variables of the posterior pose appearance and the pose configuration, respectively. Finally, an unsupervised error term, which is part of the recurrent architecture, ensures smooth estimates of the final pose.
Experiments on three challenging public datasets, ICVL, MSRA, and NYU, demonstrate that the proposed method performs comparably to or better than state-of-the-art approaches.

Item Learning Hand Latent Features For Unsupervised 3D Hand Pose Estimation (2019-05-06) Banzi, Jamal; Bulugu, Isack
Recent hand pose estimation methods require large amounts of annotated training data to extract the dynamic information from a hand representation. Nevertheless, precise and dense annotation of real data is difficult to come by, and the amount of information passed to the training algorithm is significantly higher. This paper presents an approach to developing a hand pose estimation system that can accurately regress a 3D pose in an unsupervised manner. The whole process is performed in three stages. First, the hand is modelled by a novel latent tree dependency model (LTDM), which transforms internal joint locations into an explicit representation. Second, we perform predictive coding of image sequences of hand poses in order to capture the latent features underlying a given image without supervision. A mapping is then performed between the depth image and the generated representation. Third, the hand joints are regressed using convolutional neural networks to estimate the latent pose given a depth map. Finally, an unsupervised error term, which is part of the recurrent architecture, ensures smooth estimates of the final pose. To demonstrate the performance of the proposed system, a complete set of experiments is conducted on three challenging public datasets, ICVL, MSRA, and NYU.
The empirical results show that our method performs comparably to or better than state-of-the-art approaches.

Item A novel Hand Pose Estimation using Dicriminative Deep Model and Transductive Learning Approach for Occlusion Handling and Reduced Descrepancy (IEEE, 2017-05-11) Banzi, Jamal; Bulugu, Isack; Ye, Zhongfu
Discriminative models have distinguished themselves in hand pose estimation. However, key challenges remain: how to integrate the self-similar parts of fingers, which often occlude each other, and how to reduce the discrepancy between synthetic and realistic data for accurate estimation. To handle occlusion, which leads to inaccurate estimation, this paper presents a probabilistic framework for finger position detection. In this framework, the visibility correlation among fingers aids in predicting the occluded parts between fingers, thereby allowing the hand pose to be estimated accurately. Unlike the conventional occlusion handling approach, which treats the occluded parts of fingers as independent detection targets, this paper presents a discriminative deep model that learns the visibility relationship among the occluded parts of fingers at multiple layers. In addition, we propose the semi-supervised Transductive Regression (STR) forest for classification and regression to minimise the discrepancy between realistic and synthetic pose data. Experimental results demonstrate promising performance with respect to occlusion handling and discrepancy reduction, with a higher degree of accuracy than state-of-the-art approaches.
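
Several abstracts in this listing report accuracy as an average joint error in millimetres (for example, 7.5 mm on ICVL and 12.2 mm on NYU). As a rough illustration only, and assuming the common definition of this metric (the Euclidean distance between predicted and ground-truth 3D joint positions, averaged over all joints and frames), it could be computed as in the sketch below. The function name `mean_joint_error` and the list-of-frames data layout are hypothetical and are not taken from the listed papers.

```python
import math


def mean_joint_error(predicted, ground_truth):
    """Average Euclidean distance over all joints and frames.

    Assumed layout (hypothetical, not from the listed papers):
    each argument is a list of frames, and each frame is a list of
    (x, y, z) joint positions in millimetres.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("frame counts differ")

    total = 0.0
    count = 0
    for pred_frame, true_frame in zip(predicted, ground_truth):
        if len(pred_frame) != len(true_frame):
            raise ValueError("joint counts differ within a frame")
        for pred_joint, true_joint in zip(pred_frame, true_frame):
            # Distance between one predicted joint and its ground truth.
            total += math.dist(pred_joint, true_joint)
            count += 1

    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# One frame with two joints: the first matches exactly, the second is 5 mm off.
example_pred = [[(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]]
example_true = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
example_error = mean_joint_error(example_pred, example_true)
```

In this toy example the two joint distances are 0 mm and 5 mm, so the average is 2.5 mm. Published evaluations may differ in detail, for instance in which joints are included per dataset.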