Real Traffic Dynamics and its Energy-Efficient Service Provisioning in 5G Mobile Networks
As mobile video and machine-type communication (MTC) traffic grows rapidly, next-generation (5G) mobile networks are expected to simultaneously provide 1000-fold more capacity and 100-fold more connections than today’s mobile networks. Meanwhile, the energy consumption of mobile networks has become a major contributor to worldwide CO2 emissions, so 5G networks also have to be much more energy-efficient (green) than today’s. To meet these challenges, the traditional peak-traffic-based network dimensioning and resource allocation mechanisms have to be revisited; in other words, 5G networks have to be more adaptive (smart) to mobile traffic dynamics. In this talk, we first characterize mobile traffic dynamics (in both the temporal and spatial domains, and for both mobile video and MTC traffic) using real measurement data from commercial mobile networks. Then we propose several smarter and greener service provisioning mechanisms that exploit these dynamics. Through numerical analyses, we find that traffic dynamics can in fact help improve the energy efficiency of mobile networks. This provides a new direction for managing mobile traffic and its service provisioning in future networks, as mobile traffic will become increasingly uncertain and dynamic.
Zhisheng Niu graduated from Northern Jiaotong University (now Beijing Jiaotong University), Beijing, China, in 1985, and received his M.E. and D.E. degrees from Toyohashi University of Technology, Toyohashi, Japan, in 1989 and 1992, respectively. After spending two years at Fujitsu Laboratories Ltd., Kawasaki, Japan, he joined Tsinghua University, Beijing, China, in 1994, where he is now a professor in the Department of Electronic Engineering. His major research interests include queueing theory, traffic engineering, mobile Internet, radio resource management of wireless networks, and green communications and networks.
Dr. Niu has been an active volunteer for various academic societies, including Chair of Emerging Technologies Committee (2014-15), Director for Conference Publications (2010-11), and Director for Asia-Pacific Board (2008-09) of IEEE Communication Society, Membership Development Coordinator (2009-10) of IEEE Region 10, Councilor of IEICE-Japan (2009-11), and standing committee member of Chinese Institute of Communications (2012-16). He is a distinguished lecturer of both IEEE Communication Society (2012-15) and Vehicular Technology Society (2014-16). He received the Outstanding Young Researcher Award from Natural Science Foundation of China in 2009. He is now a fellow of IEEE and IEICE (Japan).
Helen M. MENG
Chinese Univ. of Hong Kong
The Science of Speech and Language Data for Learning and Health
Speech and language constitute primary forms of human communication. Aside from facilitating human-computer interaction, speech and language processing techniques may unveil insights regarding acoustics, phonetics and linguistic topics that contribute towards language acquisition, rehabilitation and knowledge engineering. This talk will present the challenges, existing strategies and our ongoing work in adapting automatic speech recognition and synthesis technologies for pronunciation training, disordered speech analysis and also topic modeling techniques for conference publication analytics to track topical evolutions in the area of speech and language research.
Helen Meng is Professor and Chairman of the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong. She received all her degrees from MIT and joined CUHK in 1998. She is the Founding Director of the CUHK MoE-Microsoft Key Laboratory for Human-Centric Computing and Interface Technologies and the CUHK Stanley Ho Big Data Decision Analytics Research Center. Helen’s professional services include former Editor-in-Chief of the IEEE Transactions on Audio, Speech and Language Processing, IEEE Signal Processing Society Board of Governors, Chairlady of the Working Party for the Manpower Survey of the Information Technology Sector in Hong Kong, and memberships in the HKSAR Government’s Research Grants Council, Steering Committee on Electronic Health Record Sharing, and Digital 21 Strategy Advisory Committee. Helen received the Ministry of Education Higher Education Outstanding Scientific Research Output Award in Technological Advancements and CUHK's Faculty of Engineering Exemplary Teaching Award, Young Researcher Award, and Service Award in previous years. She was elected an ISCA Distinguished Lecturer for 2015-2016 and also received the Hong Kong Computing Society’s Outstanding ICT Women Award 2015. Helen is a Fellow of the Hong Kong Computer Society, the Hong Kong Institution of Engineers and the IEEE.
Theory and Methods of Signal and Data Science
University of Electronic Science and Technology of China
Low-Rank Tensor Decomposition and Its Applications in Data Analysis and Wireless Communications
Multi-dimensional data arise in a variety of applications, such as recommender systems, multirelational networks, and brain-computer imaging. Tensors (i.e., multiway arrays) provide an effective representation of such data. Tensor decomposition based on low-rank approximation is a powerful technique for extracting useful information from multiway data, as many real-world multiway data lie in a low-dimensional subspace. Recent years have seen a resurgence of interest in tensors, motivated by a number of applications involving real-world multiway data.
In this talk, I will first provide a brief review of tensor decompositions, including the Tucker and CANDECOMP/PARAFAC (CP) decompositions, which are the two most widely used low-rank tensor decompositions. Then, I will review recent advances in Tucker decomposition and present a new iteratively reweighted method for Tucker decomposition of incomplete multiway tensors. Experimental results on image inpainting/denoising and classification are provided to illustrate the performance of the proposed method against other competing Tucker decomposition methods. Lastly, I will show that CP decomposition, as a powerful piece of multilinear tensor algebra, can be used to achieve a substantial overhead reduction in millimeter-wave channel estimation. Simulation results show that the CP decomposition-based method is more computationally efficient than compressed sensing-based techniques for millimeter-wave channel estimation, while achieving better estimation accuracy.
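As background for the CP decomposition discussed above, the following is a minimal alternating-least-squares (ALS) sketch for a 3-way tensor in NumPy. It is an illustrative toy, not the talk's channel estimation method; production tensor toolboxes add column normalization, convergence checks, and missing-data handling.

```python
import numpy as np

def cp_als(T, rank, n_iter=200, seed=0):
    """Toy CP (CANDECOMP/PARAFAC) decomposition of a 3-way tensor via
    alternating least squares: find A, B, C such that
    T[i, j, k] ~= sum_r A[i, r] * B[j, r] * C[k, r]."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Mode-n unfoldings (rows index that mode, columns index the rest).
    T1 = T.reshape(I, J * K)
    T2 = np.moveaxis(T, 1, 0).reshape(J, I * K)
    T3 = np.moveaxis(T, 2, 0).reshape(K, I * J)
    # Khatri-Rao (column-wise Kronecker) product.
    kr = lambda X, Y: (X[:, None, :] * Y[None, :, :]).reshape(-1, X.shape[1])
    for _ in range(n_iter):  # each update is a linear least-squares solve
        A = T1 @ np.linalg.pinv(kr(B, C).T)
        B = T2 @ np.linalg.pinv(kr(A, C).T)
        C = T3 @ np.linalg.pinv(kr(A, B).T)
    return A, B, C

# Exact rank-2 tensor: ALS should recover it almost perfectly.
rng = np.random.default_rng(1)
T = np.einsum('ir,jr,kr->ijk', rng.standard_normal((4, 2)),
              rng.standard_normal((5, 2)), rng.standard_normal((6, 2)))
A, B, C = cp_als(T, rank=2)
err = np.linalg.norm(T - np.einsum('ir,jr,kr->ijk', A, B, C)) / np.linalg.norm(T)
print(err)  # near zero for an exact low-rank tensor
```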
Jun Fang received the B.Sc. and M.Sc. degrees from Xidian University, Xi'an, China, in 1998 and 2001, respectively, and the Ph.D. degree in electrical engineering from the National University of Singapore, Singapore, in 2006. During 2006, he was a postdoctoral research associate with the Department of Electrical and Computer Engineering, Duke University. From January 2007 to December 2010, he was a research associate with the Department of Electrical and Computer Engineering, Stevens Institute of Technology. He joined the University of Electronic Science and Technology of China (UESTC) in 2011. Dr. Fang has authored/co-authored more than 30 papers in leading IEEE journals. He received the Outstanding Paper Award at the 2011 IEEE Africon Conference (co-authored with P. Wang, N. Han, and H. Li). In 2013, he received the IEEE Vehicular Technology Society (VTS) Jack Neubauer Memorial Award for the best systems paper, "Multi-antenna assisted spectrum sensing in cognitive radio" (P. Wang, J. Fang, N. Han, H. Li), published in IEEE Trans. Vehicular Technology in May 2010. He is currently an Associate Technical Editor for IEEE Communications Magazine and an Associate Editor for IEEE Signal Processing Letters.
From Lp Minimization to Weakly Convex Optimization
Signals of interest often possess inherent low-dimensional structures, such as the sparsity of vectors and the low rank of matrices. How to effectively take advantage of these structures has been a core research topic in the signal processing community over the last decade. For numerous applications, the key issue can be formulated as a sparse recovery problem. Algorithms based on non-convex lp minimization have been shown to outperform l1 minimization, but the lack of convergence guarantees from the initialization to the sparse signal has been the Achilles' heel of these algorithms.
We begin with a theoretical analysis of the lp minimization problem, including results on global and local optimality. By incorporating the concept of weak convexity, a class of sparsity-inducing penalties is then introduced, with a characterization of their non-convexity; these penalties can be used to approximate the lp ``norm''. It is proved that the global optimum of the corresponding weakly convex optimization problem approaches the global optimum of lp minimization. To ensure that the sparse signal is the only local optimum of the weakly convex optimization problem in its neighborhood, requirements on the non-convexity and on the radius of the neighborhood are established.
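For concreteness, one standard example of a sparsity-inducing weakly convex penalty (our illustration; the talk's class of penalties may differ) is the minimax concave penalty (MCP). Recall that a function $J$ is called $\rho$-weakly convex if $J(\mathbf{x}) + \frac{\rho}{2}\|\mathbf{x}\|_2^2$ is convex. The separable penalty $J(\mathbf{x}) = \sum_i F(|x_i|)$ with

```latex
F(t) =
\begin{cases}
  \lambda t - \dfrac{t^2}{2\gamma}, & 0 \le t \le \gamma\lambda,\\[4pt]
  \dfrac{\gamma\lambda^2}{2},       & t > \gamma\lambda,
\end{cases}
\qquad \gamma > 0,
```

is $(1/\gamma)$-weakly convex: adding $\frac{1}{2\gamma}t^2$ yields $\lambda t$ on $[0,\gamma\lambda]$ and a quadratic with matching slope beyond it, hence a convex function. The parameter $\gamma$ controls the non-convexity, trading off how closely $J$ approximates the lp ``norm'' against how benign the optimization landscape is.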
We then devise algorithms for the weakly convex optimization problems in different scenarios. It is proved that if the non-convexity is inversely proportional to the distance between the initialization and the sparse signal, these algorithms are guaranteed to converge to the sparse signal. This is the first comprehensive convergence analysis of algorithms for sparse recovery based on non-convex optimization. Numerical simulations verify the theoretical results and indicate that the algorithms can recover signals with more nonzero elements in less running time, with better denoising performance.
Yuantao Gu (S'02-M'03) received the B.E. degree from Xi’an Jiaotong University, Xi’an, China, in 1998, and the Ph.D. degree with honor from Tsinghua University, Beijing, China, in 2003, both in Electronic Engineering. He joined the faculty of Tsinghua in 2003 and is now an Associate Professor with Department of Electronic Engineering. He was a visiting scientist at Microsoft Research Asia, Beijing during December 2005 to February 2006, Research Laboratory of Electronics at Massachusetts Institute of Technology (MIT), Cambridge, MA during August 2012 to August 2013, and Department of Electrical Engineering and Computer Science at the University of Michigan in Ann Arbor, MI during September to October 2015. He is the author/coauthor of two textbooks on Signals and Systems, and more than 80 papers on signal processing, multimedia communications, and wireless networks. His research interests include adaptive filtering, sparse signal recovery, multimedia signal processing, and related topics in wireless communications and information networks. Currently, he serves as Associate Editor for IEEE Transactions on Signal Processing and Handling Editor for EURASIP Digital Signal Processing.
Communications and Networking
Huazhong University of Science and Technology
QoE-Aware Video Delivery with D2D Communications
With the growing popularity of smart devices (e.g., smartphones and tablets), people enjoy watching high-quality videos on their mobile devices anytime and anywhere over ubiquitous wireless networks. Meanwhile, Quality-of-Experience (QoE) is considered one of the next-generation video quality metrics from the user’s perspective. Guaranteeing QoE when delivering video over cellular networks is more complicated than guaranteeing the transmission rate, since tactile-level scheduling is required. To realize tactile-level scheduling and unleash the power of short-range communications, we propose a novel QoE-aware network architecture with in-network caching and computation capabilities. Specifically, QoE can be enhanced by offloading mobile video traffic from far base stations to nearby user-equipment helpers via device-to-device (D2D) communications. For D2D-assisted video delivery, we study three problems: the first is the incentive problem, i.e., how to stimulate users to participate in D2D-assisted video relaying/sharing so as to maximize the probability of enhancing QoE; the second is the energy problem, i.e., how to balance UE energy consumption so as to prolong the lifetime of the QoE-aware user cluster; and the third is the efficiency problem, which focuses on a QoE-aware radio resource allocation mechanism that maximizes resource utilization efficiency. In summary, this talk provides a clear and unique perspective on QoE-aware video delivery.
Tao Jiang is currently a Distinguished Professor in the School of Electronics Information and Communications, Huazhong University of Science and Technology, Wuhan, P. R. China. He received the B.S. and M.S. degrees in applied geophysics from China University of Geosciences, Wuhan, P. R. China, in 1997 and 2000, respectively, and the Ph.D. degree in information and communication engineering from Huazhong University of Science and Technology in April 2004. From Aug. 2004 to Dec. 2007, he held positions at several universities, including Brunel University and the University of Michigan-Dearborn. He has authored or co-authored over 200 technical papers in major journals and conferences and 8 books/chapters in the areas of communications and networks. He has served on the symposium technical program committees of major IEEE conferences, including INFOCOM, GLOBECOM, and ICC. He was invited to serve as TPC Symposium Chair for IEEE GLOBECOM 2013, IEEE WCNC 2013, and ICCC 2013. He has served or is serving as an associate editor of several technical journals in communications, including IEEE Transactions on Signal Processing, IEEE Communications Surveys and Tutorials, IEEE Transactions on Vehicular Technology, and IEEE Internet of Things Journal. He is a recipient of the NSFC Distinguished Young Scholars Award in 2013, and was named one of the Young and Middle-Aged Leading Scientists, Engineers and Innovators by the Ministry of Science and Technology of China in 2014. He was listed among the Most Cited Chinese Researchers in Computer Science by Elsevier in 2014 and 2015. He is a senior member of IEEE.
University of Electronic Science and Technology of China
Pilot Spoofing Attack in Massive MIMO Systems
The pilot spoofing attack is a form of active eavesdropping conducted by a malicious user during the channel estimation phase of a legitimate transmission. In this attack, an intelligent adversary corrupts the transmitter's estimate of the channel state information (CSI) by sending the same pilot signal as the legitimate receiver, in order to obtain a larger information rate in the data transmission phase. The pilot spoofing attack can also drastically weaken the strength of the received signal at the legitimate receiver if the adversary uses large enough power. Motivated by the serious problems the pilot spoofing attack can cause, we propose an efficient detector, named the energy ratio detector (ERD), which exploits the asymmetry of the received signal power levels at the transmitter and the legitimate receiver when a pilot spoofing attack is present. We also propose solutions for recovering secure communications once such an attack is identified.
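A rough numerical illustration of the power asymmetry that the ERD exploits, under an idealized noiseless model: the channel statistics, the simple beamforming rule, and the `attack` flag below are all our simplifying assumptions, not the talk's exact detector statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 64  # transmit antennas (massive MIMO)
h = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)  # Tx -> legit Rx
g = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)  # Tx -> adversary

def energy_ratio(attack, p_eve=1.0):
    """Toy energy-ratio statistic: downlink power seen at the legitimate
    receiver divided by uplink pilot power seen at the transmitter."""
    # Uplink: the channel estimate is contaminated under a spoofing attack.
    h_est = h + (np.sqrt(p_eve) * g if attack else 0)
    # Downlink: transmitter beamforms along the (possibly spoofed) estimate.
    w = h_est / np.linalg.norm(h_est)
    p_tx = np.linalg.norm(h_est) ** 2   # uplink pilot power at the transmitter
    p_rx = np.abs(h.conj() @ w) ** 2    # downlink power at the legitimate receiver
    return p_rx / p_tx

print(energy_ratio(False))  # ~1.0: powers are symmetric without an attack
print(energy_ratio(True))   # markedly below 1 under spoofing
```

Thresholding this ratio (against a value calibrated from the no-attack case) gives a detection rule in the spirit described above.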
Dr Ying-Chang Liang (F’11) is a Professor in the University of Electronic Science and Technology of China (UESTC), China. He was a Principal Scientist and Technical Advisor in the Institute for Infocomm Research (I2R), Singapore, a visiting scholar in the Department of Electrical Engineering, Stanford University, CA, USA, from Dec 2002 to Dec 2003, and was an adjunct faculty in National University of Singapore and Nanyang Technological University from 2004 – 2009. His research interest lies in the general area of wireless networking and communications, with current focus on applying artificial intelligence, big data analytics and machine learning techniques to wireless network design and optimization.
Dr Liang was elected a Fellow of the IEEE in December 2010, and was recognized by Thomson Reuters as a Highly Cited Researcher in 2014 and 2015. He received IEEE Jack Neubauer Memorial Award in 2014, the First IEEE Communications Society APB Outstanding Paper Award in 2012, and the EURASIP Journal of Wireless Communications and Networking Best Paper Award in 2010. He also received the Institute of Engineers Singapore (IES)’s Prestigious Engineering Achievement Award in 2007, and the IEEE Standards Association’s Outstanding Contribution Appreciation Award in 2011, for his contributions to the development of IEEE 802.22 standard.
Dr Liang is now serving as the Chair of IEEE Communications Society Technical Committee on Cognitive Networks. He is on the Editorial Board of IEEE Signal Processing Magazine, and IEEE Transactions on Signal and Information Processing over Networks, and is an Associate Editor-in-Chief of the World Scientific Journal on Random Matrices: Theory and Applications. He served as Founding Editor-in-Chief of IEEE Journal on Selected Areas in Communications – Cognitive Radio Series, and was the key founder of the new journal IEEE Transactions on Cognitive Communications and Networking. He was an Editor of IEEE Transactions on Wireless Communications from 2002 to 2005, and an Associate Editor of IEEE Transactions on Vehicular Technology from 2008 to 2012, and a (Leading) Guest Editor of five special issues on emerging topics published in IEEE, EURASIP and Elsevier journals. Dr Liang was a Distinguished Lecturer of the IEEE Communications Society and the IEEE Vehicular Technology Society, and has been a member of the Board of Governors of the IEEE Asia-Pacific Wireless Communications Symposium since 2009. He served as Technical Program Committee (TPC) Chair of CROWN’08 and DySPAN’10, Symposium Chair of ICC’12 and Globecom’12, General Co-Chair of ICCS’10 and ICCS’14. He serves as TPC Chair and Executive Co-Chair of Globecom’17 to be held in Singapore.
Shanghai Jiaotong University
Multicasting and Caching for Content-Centric Wireless Networks
The driving force behind the exponential growth of mobile data traffic has fundamentally shifted from “connection-centric” communications, such as phone calls and text messages, to the explosion of “content-centric” communications, such as video streaming and content sharing. In this talk, I will present an initial attempt towards the design of content-centric wireless networks by incorporating multicasting and caching in the physical layer. We consider a cache-enabled cloud radio access network. Users requesting the same content form a multicast group and are served cooperatively by the same cluster of base stations (BSs). Each BS acquires the requested contents either from its local cache or from a central processor via backhaul links. We formulate an optimization problem of content-centric BS clustering and multicast beamforming to minimize the total network cost subject to the content QoS constraints. Our simulation results demonstrate that the proposed multicast-based content-centric transmission offers significant power reduction over the conventional user-centric design. The effects of several heuristic caching strategies on backhaul cost reduction are also evaluated.
Meixia Tao received the B.S. degree from Fudan University, Shanghai, China, in 1999, and the Ph.D. degree from Hong Kong University of Science and Technology in 2003. She is currently a Professor with the Department of Electronic Engineering, Shanghai Jiao Tong University, China. Prior to that, she was a Member of Professional Staff at Hong Kong Applied Science and Technology Research Institute during 2003-2004, and a Teaching Fellow then an Assistant Professor at the Department of Electrical and Computer Engineering, National University of Singapore from 2004 to 2007. Her current research interests include content-centric wireless networks, resource allocation, interference management and coordination, and physical layer security.
Dr. Tao is a member of the Executive Editorial Committee of the IEEE Transactions on Wireless Communications. She serves as an Editor for the IEEE Transactions on Communications and the IEEE Wireless Communications Letters. Dr. Tao is the recipient of the IEEE Heinrich Hertz Award for Best Communications Letters in 2013 and the IEEE ComSoc Asia-Pacific Outstanding Young Researcher Award in 2009. She also receives the best paper awards from IEEE/CIC ICCC 2015 and IEEE WCSP 2012.
Speech, Audio and Language processing
CAS Institute of Automation
Multimodal Dimensional Emotion Recognition
The talk will begin by summarizing the state-of-the-art research on emotion recognition. It will then introduce a method that predicts continuous values of the emotion dimensions arousal and valence from audio and visual modalities, using the state-of-the-art classifier for dimensional recognition, the long short-term memory recurrent neural network (LSTM-RNN). Beyond the regular LSTM-RNN prediction architecture, two techniques are investigated for the dimensional emotion recognition problem. The first is the use of the ε-insensitive loss as the loss function to optimize. Compared to the squared loss, which is the most widely used loss function for dimensional emotion recognition, the ε-insensitive loss is more robust to label noise: it ignores small errors and thus yields stronger correlation between predictions and labels. The second is temporal pooling. This technique enables temporal modeling in the input features and increases the diversity of the features fed into the forward prediction architecture. Experimental results show the effectiveness of the key components of the proposed method, and competitive results are obtained.
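The ε-insensitive loss mentioned above can be sketched in a few lines of NumPy; the threshold value and the example labels below are arbitrary choices for illustration, not the talk's settings.

```python
import numpy as np

def eps_insensitive_loss(pred, target, eps=0.05):
    """epsilon-insensitive loss: errors within eps of the label cost
    nothing (robust to small label noise); larger errors grow linearly,
    unlike the squared loss, which amplifies outliers quadratically."""
    return np.maximum(np.abs(pred - target) - eps, 0.0)

# Arousal/valence-style predictions vs. (possibly noisy) labels.
pred   = np.array([0.50, 0.62, 0.10])
target = np.array([0.52, 0.40, 0.90])
print(eps_insensitive_loss(pred, target))  # approximately [0. 0.17 0.75]
```

The first error (0.02) falls inside the ε-tube and is ignored, while the larger errors are penalized only linearly.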
Prof. Jianhua Tao received his Ph.D. from Tsinghua University in 2001 and his M.S. from Nanjing University in 1996. He is currently a Professor and the Deputy Director of the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences. He is also a Chief Professor at the University of Chinese Academy of Sciences and a key member of the Center for Excellence in Brain Science and Intelligence Technology (CEBSIT).
His current research interests include speech recognition and synthesis, emotion recognition, multimodal human-computer interaction, and pattern recognition. He has published more than 150 papers in major journals and proceedings and has received several awards from important conferences, such as Eurospeech and NCMMSC. In recognition of this work, he received the China National Funds for Distinguished Young Scientists in 2014. His research has been supported by the 863 Program, NSFC, and other national funds, and he is the Chief Scientist of the major 863 project “Multimodal Natural Interaction Technology for Mobile Devices”.
He currently serves as a Steering Committee Member for IEEE Transactions on Affective Computing, Subject Editor for Speech Communication, Associate Editor for the Journal on Multimodal User Interfaces and the International Journal on Synthetic Emotions, and Deputy Editor-in-Chief of the Chinese Journal of Phonetics. He also serves as chair or program committee member for several major conferences, including ICPR, ACII, ICMI, ISCSLP, NCMMSC, and CHCI.
Theoretical Problems in Speaker Verification Applications
Biometric recognition technologies, including face recognition and fingerprint recognition, have been attracting more and more attention recently, especially in the e-banking area. While users enjoy the convenience these technologies bring, they also suffer from many problems in practice. In this talk, taking voiceprint recognition as an example, the speaker will address some theoretical problems arising in speaker verification applications, including time-varying robustness, physical-condition robustness, anti-spoofing, and real intention detection.
Dr. Thomas Fang Zheng is a full research professor and Director of the Center for Speech and Language Technologies (CSLT), RIIT, THU.
Since 1988, he has been working on speech and language processing. So far, he has published more than 250 journal and conference papers, 11 of which were recognized as Excellent Papers, and 11 books.
He has been serving in many conferences, journals, and organizations.
He is an IEEE Senior member, a CCF (China Computer Federation) Senior Member, a council member of Chinese Information Processing Society of China, a council member of the Acoustical Society of China, and so on.
He serves as Council Chair of Chinese Corpus Consortium (CCC), APSIPA (Asia-Pacific Signal and Information Processing Association) Vice President – Institutional Relations and Education Program (Term 2013-2014), APSIPA Advisory Board member (Term 2015-2016) and Vice President – Conference (Term 2016-2017), Chair of the Steering Committee of the National Conference on Man-Machine Speech Communication (NCMMSC) of China (Terms 2007-2010 and 2011-2014), head of the Voiceprint Recognition (VPR) special topic group of the Chinese Speech Interactive Technology Standard Group, Vice Director of Subcommittee 2 on Human Biometrics Application of Technical Committee 100 on Security Protection Alarm Systems of Standardization Administration of China (SAC/TC100/SC2).
He is an associate editor of IEEE Transactions on Audio, Speech, and Language Processing, a member of editorial board of Speech Communication, a member of editorial board of APSIPA Transactions on Signal and Information Processing, an associate editor of International Journal of Asian Language Processing, a series editor of SpringerBriefs in Signal Processing, and a member of editorial committee of the Journal of Chinese Information Processing.
He has served as co-chair of Program Committee of International Symposium on Chinese Spoken Language Processing (ISCSLP) 2000, member of Technical Committee of ISCSLP 2000, member of Organization Committee of Oriental COCOSDA 2000, member of Program Committee of NCMMSC 2001, member of Scientific Committee of ISCA Tutorial and Research Workshop on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology 2002, member of Organization Committee and international advisor of Joint International Conference of SNLP-O-COCOSDA 2002, General Chair of Oriental COCOSDA 2003, member of Scientific Committee of International Symposium on Tonal Aspects of Languages (TAL) 2004, member of Scientific Committee and Session Chair of ISCSLP 2004, chair of Special Session on Speaker Recognition in ISCSLP 2006, Program Committee Chair of NCMMSC 2007, Program Committee Chair of NCMMSC 2009, Tutorial Co-Chair of APSIPA ASC 2009, Program Committee Chair of NCMMSC 2011, General Co-Chair of APSIPA ASC 2011, APSIPA Distinguished Lecturer (2012-2013), general co-chair of NCMMSC 2013, General Co-Chair of IEEE ChinaSIP (IEEE China Summit and International Conference on Signal and Information Processing) 2013, Area Chair of Interspeech 2014, General Co-Chair of ISCSLP 2014, Publication Chair of ICASSP 2016.
He has also been working on building a "Study-Research-Product" pipeline, devoting himself to transferring speech and language technologies to industry, including language learning, embedded speech recognition, speaker recognition for public security and telephone banking, and location-centered intelligent information retrieval services. He holds over 15 patents in various aspects of speech and language technologies.
Over the years, he has received the 1997 Beijing City Patriotic and Contributing Model Certificate, the 1999 National College Young Teacher (Teaching) Award issued by the Fok Ying Tung Education Foundation of the Ministry of Education (MOE), the 2000 1st Prize of the Beijing City College Teaching Achievement Award, the 2001 2nd Prize of the Beijing City Scientific and Technical Progress Award, the 2007 3rd Prize of the Science and Technology Award of the Ministry of Public Security, and the 2009 China "Industry-University-Research Institute" Collaboration Innovation Award.
Image, Video, and Multimedia Processing
Microsoft Research - Asia
Video Captioning: Bridging Video and Language with Deep Learning
The recent advances in deep learning have boosted research on video analysis. For example, convolutional neural networks have demonstrated superiority in modeling high-level visual concepts, while recurrent neural networks have proven good at modeling mid-level temporal dynamics in video data. We present a few recent advances in understanding video content using deep learning techniques. This talk will focus on translating video to language, one of the ultimate goals of video understanding in computer vision and multimedia research. Specifically, we will present recent work bridging video and language with joint embedding and translation, which achieves the best performance to date on this nascent vision task. We will also talk about our new dataset and future directions for video and language.
Dr. Tao Mei is a Lead Researcher with Microsoft Research Asia. His current research interests include multimedia information retrieval and computer vision. He has authored or co-authored over 100 papers in journals and conferences and holds 15 U.S. granted patents. Tao was the recipient of several paper awards from prestigious multimedia journals and conferences, including the IEEE T-CSVT Best Paper Award in 2014, the IEEE T-MM Prize Paper Award in 2013, and the Best Paper Awards at ACM Multimedia in 2009 and 2007, etc. He is an Associate Editor of IEEE Trans. on Multimedia (TMM), ACM Trans. on Multimedia Computing, Communications, and Applications (TOMM), Machine Vision and Applications (MVA), and Multimedia Systems (MMSJ). He is the General Co-chair of ACM ICIMCS 2013, the Program Co-chair of ACM MM 2018, IEEE ICME 2015, IEEE MMSP 2015 and MMM 2013. He received the B.E. degree in automation and the Ph.D. degree in pattern recognition and intelligent systems from the University of Science and Technology of China, Hefei, China, in 2001 and 2006, respectively.
Shanghai Jiaotong University
Structured Sparse Learning for Signal Processing and Vision
In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Structured sparse signal representation models further regularize the sparse estimation by assuming dependency in the selection of the active atoms. This talk addresses our ongoing efforts in signal processing and vision, beginning with the incorporation of structured sparsity into a generalized representation to fit the varying non-stationary statistics of sampled data. More specifically, the dictionary is learned and adapted to the data, yielding a compact representation. Building on existing scattering networks, we explore jointly learning a deep scattering convolution network by casting the classification problem as a multiple kernel learning problem. The convolution paths of the network are each kernelized and selected in a large-margin context. Furthermore, an invertible topology is constructed with iterated filter banks to enable reconstruction in a variety of contexts. Finally, we show promising examples in vision by learning a multi-task semantic codebook via submodular optimization.
Hongkai Xiong is currently a Distinguished Professor in the Department of Electronic Engineering, Shanghai Jiao Tong University (SJTU), where he has been on the faculty since receiving his Ph.D. degree from SJTU in 2003. During 2007-2008, he was a research scholar in the Department of Electrical and Computer Engineering of Carnegie Mellon University (CMU). From 2011 to 2012, he was a scientist with the Department of Biomedical Informatics at the University of California, San Diego (UCSD).
Dr. Xiong’s research interests include signal processing, multimedia communication, source coding, and computer vision. He has published over 170 refereed journal and conference papers. His research projects are funded by NSF, QUALCOMM, MICROSOFT, and INTEL. He was the recipient of the Best Student Paper Award at the 2014 IEEE Visual Communication and Image Processing conference (IEEE VCIP’14), the Best Paper Award at the 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (IEEE BMSB’13), and the Top 10% Paper Award at the 2011 IEEE International Workshop on Multimedia Signal Processing (IEEE MMSP’11). He has served as a TPC member for prestigious conferences such as ACM Multimedia, ICIP, ICME, and ISCAS.
In 2014, Dr. Xiong was granted the National Science Fund for Distinguished Young Scholars and was named a Shanghai Youth Science and Technology Talent. In 2013, he was a recipient of the Shanghai Shu Guang Scholar award. Since 2012, he has been a member of the Innovative Research Groups of the National Natural Science Foundation of China. In 2011, he obtained the First Prize of the Shanghai Technological Innovation Award for “Network-oriented Video Processing and Dissemination: Theory and Technology”. In 2010 and 2013, he obtained the SMC-A Excellent Young Faculty Awards of Shanghai Jiao Tong University. In 2009, he was a recipient of the New Century Excellent Talents in University award, Ministry of Education of China. He has been a senior member of the IEEE since 2010.
Hong Kong PolyU.
Weighted Nuclear Norm Minimization and Its Applications to Low Level Vision
As a convex relaxation of the rank minimization model, the nuclear norm minimization (NNM) problem has been attracting significant research interest in recent years. The standard NNM regularizes each singular value equally, composing an easily calculated convex norm. However, this restricts its capability and flexibility in dealing with many practical problems, where the singular values have clear physical meanings and should be treated differently. We study the weighted nuclear norm minimization (WNNM) problem, which adaptively assigns weights to different singular values. As the key step in solving general WNNM models, the theoretical properties of the weighted nuclear norm proximal (WNNP) operator are investigated. We show that WNNP is equivalent to a standard quadratic programming problem with linear constraints, which facilitates solving the original problem with off-the-shelf convex optimization solvers. In particular, when the weights are sorted in non-descending order, the optimal solution can be obtained in closed form. Multiple extensions of WNNM, including robust PCA and matrix completion, can be readily constructed under the augmented Lagrange multiplier paradigm. The proposed WNNM methods achieve state-of-the-art performance in typical low-level vision tasks, including image denoising, background subtraction, and image inpainting.
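The closed-form case mentioned above can be sketched concretely: with non-descending weights (matched to the non-ascending singular values), the weighted nuclear norm proximal operator reduces to soft-thresholding each singular value by its own weight. The NumPy toy below is a minimal illustration; the matrix `Y` and weights `w` are arbitrary example inputs.

```python
import numpy as np

def wnnp(Y, w):
    """Weighted nuclear norm proximal operator: with non-descending
    weights w, the optimal solution is weighted singular value
    soft-thresholding sigma_i -> max(sigma_i - w_i, 0)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_thr = np.maximum(s - w, 0.0)   # shrink each sigma_i by its weight w_i
    return U @ np.diag(s_thr) @ Vt

rng = np.random.default_rng(1)
Y = rng.standard_normal((6, 4))        # arbitrary example matrix
w = np.array([0.1, 0.5, 1.0, 2.0])     # non-descending weights
X = wnnp(Y, w)
# large singular values are shrunk a little, small ones a lot (or to zero)
print(np.round(np.linalg.svd(X, compute_uv=False), 3))
```

Assigning small weights to large (informative) singular values and large weights to small (noisy) ones is what gives WNNM its edge over the equally weighted NNM in denoising.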
Lei Zhang (M’04, SM’14) received the B.Sc. degree in 1995 from Shenyang Institute of Aeronautical Engineering, Shenyang, P.R. China, and the M.Sc. and Ph.D. degrees in Control Theory and Engineering from Northwestern Polytechnical University, Xi’an, P.R. China, in 1998 and 2001, respectively. From 2001 to 2002, he was a research associate in the Dept. of Computing, The Hong Kong Polytechnic University. From Jan. 2003 to Jan. 2006 he worked as a Postdoctoral Fellow in the Dept. of Electrical and Computer Engineering, McMaster University, Canada. In 2006, he joined the Dept. of Computing, The Hong Kong Polytechnic University, as an Assistant Professor. Since July 2015, he has been a Full Professor in the same department. His research interests include Computer Vision, Pattern Recognition, Image and Video Processing, and Biometrics. Prof. Zhang has published more than 200 papers in those areas. By 2016, his publications had been cited more than 17,000 times in the literature. Prof. Zhang is an Associate Editor of IEEE Trans. on Image Processing, IEEE Trans. on CSVT, and Image and Vision Computing. He was selected as a Highly Cited Researcher by Thomson Reuters. More information can be found on his homepage.
Towards Global Rate Distortion Optimization in Video Coding
Rate-distortion optimization (RDO) plays a crucial role in maximizing coding efficiency in video coding. In current block-based hybrid video coding with motion compensation, RDO is typically performed at the block level individually and independently, which is far from optimal because it ignores the strong spatial-temporal dependency among block/frame coding decisions. However, a global RDO problem becomes highly complex because the processing of each coding unit is dependent on and entangled with the others, due to the extensive use of spatial-temporal prediction in video coding. As reported in our previous work, we formulate the dependent RDO problem in the context of motion-compensated hybrid video coding and then simplify it to temporal-dependent RDO (TD-RDO), or inter-frame RDO, in view that temporal dependency dominates the coding process. To address the TD-RDO problem, we first develop a source distortion temporal propagation model, which estimates the influence of the current coding unit on future coding units in a temporal propagation chain by introducing a temporal distortion propagation factor. On the other hand, by thoroughly examining the effect of coding the current unit on the following coding units due to inter-frame prediction under the high-rate assumption, we develop another inter-frame dependent RDO scheme. In both ways, we show that TD-RDO can be achieved by adapting the corresponding Lagrange multiplier. Some insights are gained by comparing the two TD-RDO approaches against conventional independent RDO on the H.264/AVC reference software and the state-of-the-art HEVC reference software, respectively, with discussions toward the goal of attaining global RDO in video coding.
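For orientation, here is a toy sketch of the Lagrangian mode decision underlying RDO, and of the idea that temporal dependency can be absorbed into an adapted Lagrange multiplier. The `prop_factor` form of the adjusted multiplier and the candidate (distortion, rate) modes are purely illustrative, not the model derived in the talk.

```python
def rd_cost(distortion, rate, lam):
    """Independent RDO: Lagrangian cost J = D + lambda * R per block."""
    return distortion + lam * rate

def td_rd_cost(distortion, rate, lam, prop_factor):
    """TD-RDO sketch: a unit whose distortion propagates to future frames
    is effectively coded with a smaller Lagrange multiplier
    lambda / (1 + prop_factor), favoring higher-quality modes.
    (This exact adjustment is a hypothetical illustration.)"""
    return distortion + (lam / (1.0 + prop_factor)) * rate

# three candidate (distortion, rate) modes for one coding unit
modes = [(100.0, 10.0), (60.0, 40.0), (45.0, 80.0)]
lam = 1.0
best_indep = min(range(len(modes)), key=lambda i: rd_cost(*modes[i], lam))
best_td = min(range(len(modes)), key=lambda i: td_rd_cost(*modes[i], lam, 3.0))
# the dependent criterion shifts toward a higher-rate, lower-distortion mode
print(best_indep, best_td)
```

The point of the toy: adapting the multiplier changes which mode wins, which is exactly the lever the two TD-RDO schemes in the talk operate on.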
Ce Zhu (M’03–SM’04) received the B.S. degree from Sichuan University, Chengdu, China, and the M.Eng. and Ph.D. degrees from Southeast University, Nanjing, China, in 1989, 1992 and 1994, respectively, all in electronic and information engineering. He pursued postdoctoral research at the Chinese University of Hong Kong in 1995, and at the City University of Hong Kong and the University of Melbourne, Australia, from 1996 to 1998. He was with Nanyang Technological University, Singapore, for 14 years from 1998 to 2012, where he was a Research Fellow, Program Manager, Assistant Professor, and then Associate Professor from 2005. He has also held visiting positions at Queen Mary, University of London, U.K., and Nagoya University, Japan. He is currently a Professor with the School of Electronic Engineering, University of Electronic Science and Technology of China, China.
His research interests include image/video coding, streaming and processing, 3D video, visual perception and applications. He has authored or co-authored 140+ papers in journals and conferences, which have received 2500+ citations (around 60% from his first-authored papers). As the lead editor, he has edited three books and contributed four book chapters. He has filed 20+ patents (10 granted and 1 transferred), and has contributed 28 proposals (18 adopted) to the international standardization bodies of JCT-3V and MPEG IVC as well as the China AVS standardization (mainly on encoding optimization, with a few techniques adopted in the standard reference software). He has received two best paper awards (as the first author) and one student paper award (as a co-author with his student) at three international conferences (IIH-MSP 2013, IEEE BMSB 2014, and MobiMedia 2011, respectively).
He has served on the editorial boards of seven journals, including as an Associate Editor of IEEE Signal Processing Letters (since 2010), IEEE Transactions on Broadcasting (since 2010), IEEE Transactions on Circuits and Systems for Video Technology (since 2013), Editor of IEEE Communications Surveys and Tutorials (2012-Feb. 2015), Area Editor of Signal Processing: Image Communication (since 2011), Editorial Board Member of Multimedia Tools and Applications (since 2009), and Associate Editor of Multidimensional Systems and Signal Processing (2009-2014). He is/was a Guest Editor for three special issues of journals. He has served on technical/program committees, organizing committees and as track/area/session chairs for about 60 international conferences, including serving as a Technical Program Chair of IEEE ChinaSIP 2015 and a Program Co-Chair of 21st International Packet Video Workshop (PV 2015). He has given a few tutorials at international conferences, including at the IEEE ISCAS 2015, IEEE VCIP 2014, PCM 2014, and has also been invited to deliver keynote speeches at three international conferences. He is a Co-Chair of the Interest Group of Multimedia Processing for Communications (MPCIG) of Technical Committee on Multimedia Communications (MMTC), IEEE Communications Society, and a Chair of Special Interest Group on Big Multimedia (http://www.computer.org/web/tcmc/sigbigmm) of Technical Committee on Multimedia Computing, IEEE Computer Society. He is an elected member of the Multimedia Signal Processing Technical Committee (MMSP-TC, 2015-2017) of the IEEE Signal Processing Society, and of the Multimedia Systems and Applications Technical Committee (MSA-TC, 2004-2012, 2015-2019) of the IEEE Circuits and Systems Society. He received 2010 Special Service Award from IEEE Broadcast Technology Society, and is an IEEE BTS Distinguished Lecturer. He is a Fellow of IET.
Pattern Recognition & Machine Learning
CAS Institute of Computing Technology
Why a Dog Is a Dog
Object categorization and scene understanding are challenging problems in computer vision. In this talk, I will discuss some basic issues in object categorization. An object exists in an ecosystem related to its properties from many aspects. We model categorization as measuring similarity in a high-dimensional semantic space. A real-world categorization or scene understanding task usually contains requirements from many aspects. To embed both appearance and semantic similarity in these tasks, we propose to learn a binary code that encodes identity and semantic attributes simultaneously. The basic idea combines both handcrafted and learned features. Different combinations of training and code generation are compared. Extensive experiments are conducted on both face and general image sets, and the results show promising performance.
Prof. Xilin Chen received the B.S., M.S., and Ph.D. degrees in computer science from the Harbin Institute of Technology, Harbin, China, in 1988, 1991, and 1994, respectively. He was a professor with the Harbin Institute of Technology from 1999 to 2005. He was a visiting scholar with Carnegie Mellon University, Pittsburgh, PA, from 2001 to 2004. He has been a professor with the Institute of Computing Technology, Chinese Academy of Sciences (CAS), since August 2004. He is the Director of the Key Laboratory of Intelligent Information Processing, CAS. He has published one book and over 200 papers in refereed journals and proceedings in the areas of computer vision, pattern recognition, image processing, and multimodal interfaces. He is a leading editor of the Journal of Computer Science and Technology, and an associate editor in chief of the Chinese Journal of Computers. He served as an Organizing Committee / Program Committee Member for more than 50 conferences. He is a recipient of several awards, including the China's State Scientific and Technological Progress Award in 2000, 2003, 2005, and 2012 for his research work. He is a Fellow of the China Computer Federation (CCF).
Pattern Recognition in Combined Cyber-physical-cognitive Space
The conventional pattern recognition (PR) system is constructed across the physical space and cyberspace. That is to say, through sensors such as imaging sensors, we capture images or videos of an object of interest in the real world (physical space), and the identification of the object is then performed by the PR system in cyberspace. It is obviously a kind of man-out-of-the-loop system. Practical applications have validated that, with the help of human experience, knowledge, and cognition, the performance of a pattern recognition system can be improved for complex applications. Therefore, it is important to construct a pattern recognition system based on the fusion of the physical space, cyberspace, and cognitive space. In this talk, facial sketch-photo recognition, which is widely applied in forensics, is taken as an example to introduce a new PR system in a combined cyber-physical-cognitive space. The main content includes three parts: the graphical model (GM) framework, GM-based photo-sketch synthesis, and GM-based photo-sketch recognition.
Xinbo Gao received the B.Eng., M.Sc., and Ph.D. degrees in signal and information processing from Xidian University, Xi'an, China, in 1994, 1997, and 1999, respectively. From 1997 to 1998, he was a Research Fellow at the Department of Computer Science, Shizuoka University, Shizuoka, Japan. From 2000 to 2001, he was a Post-doctoral Research Fellow at the Department of Information Engineering, the Chinese University of Hong Kong, Hong Kong. Since 2001, he has been at the School of Electronic Engineering, Xidian University. He is currently a Cheung Kong Professor of the Ministry of Education, a Professor of Pattern Recognition and Intelligent Systems, and the Director of the State Key Laboratory of Integrated Services Networks, Xi'an, China. His current research interests include multimedia analysis, computer vision, pattern recognition, machine learning, and wireless communications. He has published six books and around 200 technical articles in refereed journals and proceedings. Prof. Gao is on the editorial boards of several journals, including Signal Processing (Elsevier) and Neurocomputing (Elsevier). He has served as General Chair/Co-Chair, Program Committee Chair/Co-Chair, or PC member for around 30 major international conferences. He is a fellow of the IET/IEE, a fellow of the CIE, and a Senior Member of the IEEE.
The Third Research Institute of the Ministry of Public Security, China
Artificial Intelligence Demands and Challenges in Public Security
Chuanping Hu is a research fellow and the director of the Third Research Institute of the Ministry of Public Security, China. He is also a specially-appointed professor and Ph.D. supervisor at Shanghai Jiao Tong University and Tsinghua University, China. He is the chairman of the ACM Shanghai chapter. He has published more than 20 papers, edited 5 books, and holds more than 30 authorized patents. His research interests include machine learning, computer vision, and intelligent transportation systems.
Learning Sequences: image caption with region-based attention and scene factorization
Learning sequences is a challenging task. Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this talk, we introduce some models for sequence modeling. Then we introduce our image caption system that exploits the parallel structures between images and sentences. In our model, the process of generating the next word, given the previously generated ones, is aligned with the visual perception experience, where the attention shifting among the visual regions imposes a thread of visual ordering. This alignment characterizes the flow of "abstract meaning", encoding what is semantically shared by both the visual scene and the text description. Our system also makes another novel modeling contribution by introducing scene-specific contexts that capture higher-level semantic information encoded in an image. The contexts adapt language models for word generation to specific scene types. We benchmark our system against published results on several popular datasets. We show that using either region-based attention or scene-specific contexts improves systems without those components, and that combining these two modeling ingredients attains state-of-the-art performance.
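A minimal sketch of the region-based attention idea (illustrative only; the actual system learns these scores inside a neural caption generator): each region feature is scored against the current decoder state, the scores are softmax-normalized, and the context fed to word generation is their weighted average. The region features and query vector below are made up.

```python
import numpy as np

def region_attention(regions, query):
    """Soft attention over region features: dot-product scores against
    the decoder state, softmax weights, and a weighted context vector."""
    scores = regions @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()          # softmax over regions
    context = weights @ regions       # convex combination of regions
    return weights, context

regions = np.array([[1.0, 0.0],       # three made-up region features
                    [0.0, 1.0],
                    [0.7, 0.7]])
query = np.array([2.0, 0.0])          # made-up decoder state
w, ctx = region_attention(regions, query)
print(w)  # the region most aligned with the query gets the largest weight
```

As the decoder state changes from word to word, the weights shift among regions, which is the "thread of visual ordering" described above.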
Changshui Zhang received his bachelor's degree from the Department of Mathematics, Peking University, in 1986, and his doctorate from the Department of Automation, Tsinghua University, in July 1992. Since July 1992, he has worked in the Department of Automation, Tsinghua University, where he is a professor and doctoral supervisor. His main research interests include machine learning, pattern recognition, and computer vision. He is a senior member of the China Computer Federation and serves on the editorial boards of the academic journals Pattern Recognition, Chinese Journal of Computers, and Acta Automatica Sinica. He has published more than 100 papers in international journals and more than 50 papers in top conferences.
Information Forensics & Security (including biometrics)
Dalian University of Technology
Steganalysis of the Big Data over Network
There is a battle between steganography and steganalysis. In recent years, most steganalysis methods have achieved quite high accuracy, benefiting from developments in machine learning. Does steganalysis really win the battle? We say NO, because of the big data over the network. Data over the network comes from multiple sources with varying quality, unbalanced data, and multiple actors, which degrade or even invalidate the high-performance steganalysis methods developed in the laboratory. A new challenge thus arises: steganalysis of the big data over the network. In this talk, digital image steganalysis, which has been widely studied in the laboratory, is taken as an example to introduce this dramatic performance degradation by analyzing the attributes of big data over the network. Several trends in steganalysis of big data over the network are summarized. Furthermore, we present our work on mismatched steganalysis via transfer learning and homogeneous analysis.
Xiangwei Kong is a Professor in the School of Information and Communication Engineering at Dalian University of Technology and Director of the Multimedia Security and Information Processing Lab there. She received her Ph.D. from Dalian University of Technology in 2003, and her B.E. and M.E. from Harbin Engineering University in 1985 and 1988, respectively. She was a Visiting Professor at the Center for Education and Research in Information Assurance and Security (CERIAS) of Purdue University from September 2006 to September 2007, and a Senior Research Scientist at New York University from December 2014 to June 2015. She has served as a council member of the Academic Committee of the China Society of Image and Graphics, and as vice chairman of the Multimedia Information Security Committee of the China Institute of Electronics. She received the Second Prize of the National Science and Technology Progress Award, among other honors. She has published over 100 papers in top journals and conferences, including more than 50 SCI/EI/ISTP papers, and holds 12 patents. Her research mainly focuses on multimedia security and forensics, content-based image retrieval and mining, and multimedia semantic understanding of big data.
Beijing Jiaotong University
Reversible Data Hiding
Data hiding offers a way to embed data into a cover medium for purposes such as ownership protection, authentication, fingerprinting, secret communication, and annotation. In most data hiding algorithms, the cover data is destroyed permanently and cannot be exactly restored after the embedded message is extracted. Recently, a new data hiding technique, namely reversible data hiding (RDH), has been proposed, in which both the cover data and the embedded message can be recovered from the marked content. This specific data hiding technique has been found to be useful in the military, medical, and legal fields, where the recovery of the original content is required after data extraction.
In the talk, we will first introduce the concept, the basic principle, and the implementation of RDH. Then we will survey the state-of-the-art. Finally, we will present some related work from our lab.
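To make the reversibility concrete, here is a toy histogram-shifting scheme, one classic family of RDH algorithms (not necessarily the specific methods covered in the talk). It assumes a peak bin `peak`, an empty bin `zero` with `peak < zero`, no overflow, and a payload whose length equals the number of peak-valued samples.

```python
import numpy as np

def embed(pixels, bits, peak, zero):
    """Histogram-shifting embedding: values strictly between peak and the
    empty bin `zero` shift up by 1; each pixel equal to `peak` carries one
    bit (0 -> stays at peak, 1 -> becomes peak + 1)."""
    out = pixels.copy()
    it = iter(bits)
    for i, v in enumerate(pixels):
        if peak < v < zero:
            out[i] = v + 1
        elif v == peak:
            out[i] = v + next(it)
    return out

def extract(marked, peak, zero):
    """Extract the bits and exactly restore the original pixels."""
    bits, restored = [], marked.copy()
    for i, v in enumerate(marked):
        if v == peak:
            bits.append(0)
        elif v == peak + 1:
            bits.append(1)
            restored[i] = peak
        elif peak + 1 < v <= zero:   # undo the histogram shift
            restored[i] = v - 1
    return bits, restored

pixels = np.array([3, 5, 5, 4, 6, 5, 2, 5])   # toy "image", peak bin = 5
bits = [1, 0, 1, 1]
marked = embed(pixels, bits, peak=5, zero=8)
rec_bits, restored = extract(marked, peak=5, zero=8)
print(rec_bits, np.array_equal(restored, pixels))  # payload and cover both recovered
```

Because shifting is invertible and the `zero` bin was empty, no information about the cover is destroyed, which is exactly the property RDH requires.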
Professor Yao ZHAO received his Bachelor of Engineering degree from the Department of Radio Engineering, Fuzhou University, in 1989, his Master of Engineering degree from the Department of Radio Engineering, Southeast University, in 1992, and his Ph.D. from the Institute of Information Science, Beijing Jiaotong University, in 1996. After that, he stayed and worked as a lecturer in the institute. He was promoted to associate professor in 1998 and exceptionally promoted to professor in 2001; in 2002 he was elected as a Ph.D. supervisor. He worked as a postdoctoral researcher at the Delft University of Technology in the Netherlands from 2001 to 2002, and visited EPFL in Switzerland as a senior visiting scholar in October 2015. Currently, he serves as the director of the Institute of Information Science of Beijing Jiaotong University, the director of the Beijing Key Laboratory of Advanced Information Science and Network Technology, the PI of the State Key Laboratory of Rail Traffic Control and Safety, and the leader of a Changjiang Scholars and Innovative Research Team. He is also a member of the Academic Degree Committee of Beijing Jiaotong University and the Chairman of the Professor Committee of the Computer and Information Technology College.
His research interests include image/video coding, digital watermarking and forensics, and video analysis and understanding. He is currently leading several national research projects from the 973 Program, 863 Program, and the National Science Foundation of China. He serves on the editorial boards of several international journals, including as an Associate Editor of the IEEE Transactions on Cybernetics, Associate Editor of the IEEE Signal Processing Letters, Area Editor of Signal Processing: Image Communication (Elsevier), and Associate Editor of Circuits, System, and Signal Processing (Springer). He was named a Distinguished Young Scholar by the National Science Foundation of China in 2010, and was elected as a Chang Jiang Scholar of Ministry of Education of China in 2013. He is an IET Fellow and an IEEE senior member.
Signal and Data Science for Bioinformatics, Neuroscience, and Bio/Medicine
CAS Institute of Automation
Human Brainnetome Atlas and its Potential Applications in Brain-inspired Computing
A brain atlas is considered to be the cornerstone of basic neuroscience and clinical research. The human brainnetome atlas is constructed from brain connectivity profiles. It is in vivo, with finer-grained brain subregions, and with anatomical and functional connection profiles. Using the human brainnetome atlas, researchers can simulate and model brain networks using informatics and simulation technologies to elucidate the basic organizing principles of the brain. Others can use the same atlas to design novel neuromorphic systems that are inspired by the architecture of the brain. Therefore, this cutting-edge human brainnetome atlas paves the way for constructing an even more fine-grained atlas of the human brain and offers the potential for applications in brain-inspired computing. In this lecture, we will summarize the advances of the human brainnetome atlas and its potential applications in brain-inspired computing. We first give a brief introduction to the history of brain atlas development. Then we present the basic ideas of the brainnetome atlas and the procedure used to construct it. After that, parcellation results for representative brain areas will be presented. We also give a brief presentation on how to use the brainnetome atlas to address issues in neuroscience and clinical research. Finally, we will give a brief perspective on the potential inspiration of the human brainnetome atlas for computer science.
Tianzi Jiang is Professor and Director of the Beijing Key Laboratory of Brainnetome, Director of the Brainnetome Center at the Institute of Automation of the Chinese Academy of Sciences, a core member of the CAS Center for Excellence in Brain Science and Intelligence Technology, and Professor at the Queensland Brain Institute, University of Queensland. His research interests include neuroimaging, brainnetome, imaging genetics, and their clinical applications in brain disorders. He is the author or co-author of over 200 peer-reviewed journal papers in these fields and the co-editor of six issues of the Lecture Notes in Computer Science. He is Associate Editor of IEEE Transactions on Cognitive and Developmental Systems and Frontiers in Neuroinformatics, and Section Editor of BMC Neuroscience.
Reading the Underlying Information from Massive Metagenome Sequencing Data
Microorganisms are everywhere. Recent studies have shown that the mixture of microbes on the human body, the microbiome, plays important roles in human physiology and diseases. Metagenome sequencing is a key technology for studying microbiomes. It produces massive amounts of data in the form of short sequencing reads. A single metagenome sample can contain millions to trillions of reads of about 100 nucleotides each. These reads contain rich information about microbiomes and their functions, but reading that information out of the huge data poses multiple challenges for mathematical models, bioinformatics methods, and computer algorithms. In this talk, I will give an overview of the information processing tasks in this field and share observations from our own practice.
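As a small example of the kind of processing involved, counting k-mers (length-k substrings) across reads is a basic building block of metagenome assembly and binning; the two short reads below are made up for illustration.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

reads = ["ACGTAC", "GTACGT"]          # made-up toy reads
c = kmer_counts(reads, 3)
print(c["GTA"])  # -> 2 (the 3-mer GTA occurs once in each read)
```

Real samples hold billions of reads, so the actual challenge discussed in the talk is doing this kind of counting and downstream inference at scale, not the counting logic itself.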
Dr. Xuegong Zhang earned his B.S. degree in Industrial Automation in 1989 and his Ph.D. degree in Pattern Recognition and Intelligent Systems in 1994, both from Tsinghua University. He joined the faculty of Tsinghua University in 1994, where he is now a Professor of Pattern Recognition and Bioinformatics. Dr. Zhang worked at the Harvard School of Public Health as a visiting scientist in computational biology in 2001-2002 and in 2006, and was a visiting scholar at the University of Southern California in 2007. Currently he is the Director of the Bioinformatics Division, Tsinghua National Laboratory for Information Science and Technology (TNLIST). His research interests include pattern recognition and machine learning, biological data mining (especially gene expression and alternative splicing analysis), and metagenomic data analysis methods and applications.
Multi-channel SP, Remote Sensing and Data Processing
Northwestern Polytechnical Univ.
Hyperspectral Remote Sensing Data Processing --- Signal and Information Processing View
In recent years, with the evolution of remote sensing, hyperspectral remote sensing (HSRS) has become a core area within the remote sensing information acquisition and processing community and has attracted increasing attention from wide communities, such as signal and image processing, electronic and optical imaging, sensors and automation, automatic target recognition (ATR), computer vision and pattern recognition, machine learning, communications, and computer applications. Hyperspectral data, or HSRS data, is typically composed of hundreds of spectral measurements for each spatial element of an imaged scene. With HSRS data, we can detect and recognize difficult targets appearing at the group-of-pixels, pixel, or even subpixel level, classify targets with greatly improved accuracy, and analyze and identify combined or mixed objects within a pixel or subpixel.
Signal processing is normally a “waveform”-based operation, and information processing is often referred to as a “content”-based operation; signal and information processing is thus a “waveform & content”-based processing. From this view, this talk will comprehensively overview techniques for hyperspectral remote sensing data processing, including preprocessing (data calibration, correction, enhancement, fusion, compression, etc.), data/signal representation, dimensionality reduction, feature mining, classification/recognition, unmixing analysis, and fast computation, with emphasis on HSRS data classification using both spatial and spectral information. Moreover, hot topics in signal and information processing for HSRS data are outlined. In particular, a framework that integrates spectral, spatial, and temporal information by using hyperspectral imaging sensors, video cameras, and multiview cameras for more complicated target detection problems will be introduced, which is expected to provide insight into more valuable research and potential new applications in the area.
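As a tiny illustration of the unmixing analysis mentioned above, under the standard linear mixing model each pixel spectrum is a combination of endmember spectra weighted by abundances. The sketch below inverts a made-up noise-free example by least squares; practical unmixing additionally enforces nonnegativity and sum-to-one constraints on the abundances.

```python
import numpy as np

# Linear mixing model: observed spectrum = endmember matrix @ abundances.
E = np.array([[0.9, 0.1],      # made-up endmember spectra (3 bands, 2 materials)
              [0.5, 0.4],
              [0.1, 0.8]])
abund_true = np.array([0.7, 0.3])
pixel = E @ abund_true          # noise-free mixed pixel spectrum

# Unconstrained least-squares inversion recovers the abundances here,
# since the system is consistent and E has full column rank.
abund, *_ = np.linalg.lstsq(E, pixel, rcond=None)
print(np.round(abund, 3))
```

Estimating subpixel material fractions this way is what makes the “mixed objects in a pixel or subpixel” analysis in the abstract possible.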
Mingyi He was born in Yanting (Mianyang), Sichuan Province. He obtained his Bachelor and Master degrees from Northwestern Polytechnical University (NPU) in 1982 and 1985, respectively, and his PhD from Xidian University in 1994. He has been with the School of Electronics and Information, NPU, where he is a Professor of signal and information processing. He is the Founder and Director of the Shaanxi Key Laboratory of Information Acquisition and Processing and the Director and Chief Scientist of the Center for Earth Observation Research. He was a visiting scholar and visiting professor at the University of Adelaide, Australia, and a visiting professor at the University of Sydney, Australia. He has published more than 300 papers in the IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Geoscience and Remote Sensing, International Journal of Computer Vision, Signal Processing, ICIP, CVPR, etc. He has made valuable contributions to hyperspectral image processing, computer vision and image processing, and neural networks and intelligent information processing, with notable applications to X-ray image processing for luggage inspection and laser-finder test systems for airborne systems. Prof. He has been a member of the Advisory Committee of the China National Council for Higher Education on Electronics and Information, a member of the NSFC reviewing expert group and the Chinese Lunar Exploration Expert Group, the Vice-President of the Shaanxi Institute of Electronics, the Vice-Director of the Spectral Imaging Earth Observation Committee of the China Committee of the International Society for Digital Earth, the Vice-President of the Space Remote Sensing Society (CSA), and the Vice-Chairman of the IET Xi’an Network. He was General Co-chair of IEEE ICIEA 2009, TPC Co-chair of ICIEA 2013, and General Chair of ChinaSIP 2014. He is currently a member of the IEEE SPS ChinaSIP steering committee.
He was the recipient of the IEEE CVPR 2012 Best Paper Award, was recognized as the “2012 Chinese Scientist of the Year,” and was awarded ten scientific prizes from China and the governmental lifelong subsidy for outstanding contribution to higher education and scientific research by the State Council of China since 1993.
Cognitive Radar: Adaptive Sensing with Prior Information
Cognitive radar, also called fully adaptive radar, has received more and more attention in the radar community. The basic idea of cognitive radar is to utilize prior information, including target and environment information, to improve radar performance. There are two main research directions in cognitive radar. The first is how to use the prior information to optimize the transmitted waveform, beampattern, and resource allocation, which we call cognitive transmitting. The second is how to use the prior information to improve clutter suppression performance, target detection performance, etc., especially in nonhomogeneous environments, which we call cognitive receiving. In this talk, our recent progress on cognitive radar will be presented. After a brief review of cognitive radar, a waveform optimization framework based on prior information and task-specific constraints will be introduced, with an application example of resource allocation for target tracking. Then, a novel fully adaptive space-time adaptive processing (STAP) scheme with online clutter sensing will be discussed to show the potential performance improvement of cognitive radar.
Hongwei Liu was born in Mengjin, Henan, in March 1971. He received the B.S. degree from Dalian University of Technology, Dalian, China, in 1992, and the M.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 1995 and 1999, respectively. From January 2001 to October 2002, he was a visiting scholar at the Department of Electrical and Computer Engineering, Duke University, Durham, USA. Dr. Liu is currently a Professor in the Department of Electronic Engineering at Xidian University, where he is also the Director of the National Laboratory of Radar Signal Processing. His research interests include radar automatic target recognition, wideband radar signal processing, netted radar, adaptive and array signal processing, and cognitive radar. He has published over 200 papers in refereed journals and proceedings.
CAS Institute of Acoustics
Unified framework for optimal design of sensor array beamformer
Beamforming is one of the most important tasks in array signal processing, with wide applications in sonar, radar, wireless communications, astronomy, seismology, medical imaging, etc. There are a number of performance measures by which one assesses the capabilities of a beamformer, such as array gain, directivity index, robustness, sidelobe level, and mainlobe width. We have formulated the beamformer weight-vector design problem as a multiply constrained problem, so that the resulting beamformer can provide a suitable trade-off among multiple conflicting performance measures. Our multiply constrained approach includes most existing beamformer design methods as special cases, which leads to very flexible designs and gives a new perspective from which to fully understand the performance of a beamformer. The approach can be extended to the wideband case, as well as to the modal array case. In general, most optimal beamformer design problems can be included in a unified framework and solved efficiently using a convex optimization solver.
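As a concrete point of reference, the classic MVDR (Capon) beamformer is one of the single-constraint special cases that a multiply constrained framework subsumes, and it has a closed form. The sketch below is a minimal NumPy illustration with a toy covariance matrix and an assumed uniform linear array; adding further sidelobe or robustness constraints is what turns the design into a general convex program requiring a numerical solver.

```python
import numpy as np

def mvdr_weights(R, a):
    """Closed-form MVDR (Capon) weights: minimize output power w^H R w
    subject to the distortionless constraint w^H a = 1."""
    Rinv_a = np.linalg.solve(R, a)
    return Rinv_a / (a.conj().T @ Rinv_a)

# Toy example: uniform linear array, 8 sensors, half-wavelength spacing.
n = 8
theta = 0.0                                            # look direction (broadside)
a = np.exp(1j * np.pi * np.arange(n) * np.sin(theta))  # steering vector
R = np.eye(n) + 0.1 * np.ones((n, n))                  # toy noise-plus-interference covariance
w = mvdr_weights(R, a)
print(abs(w.conj().T @ a))  # distortionless constraint holds: 1.0
```

The single equality constraint here is exactly the kind of constraint that the multiply constrained formulation generalizes to sets of inequality constraints (sidelobe levels, robustness bounds, and so on).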
Shefeng Yan received the B.Sc., M.Sc., and Ph.D. degrees in electrical engineering from Northwestern Polytechnical University, Xi’an, China, in 1999, 2001, and 2005, respectively.
He was a Postdoctoral Research Associate with the Institute of Acoustics, Chinese Academy of Sciences (IACAS), Beijing, China, from 2005 to 2007, and with the Department of Electronics and Telecommunications, Norwegian University of Science and Technology, Trondheim, Norway, from 2007 to 2009. Since 2009, he has been a Professor with IACAS, where he is currently the Director of the Key Laboratory of Information Technology for Autonomous Underwater Vehicles, Chinese Academy of Sciences. He was a Senior Visiting Scholar at the Chair of Multimedia Communications and Signal Processing (LMS), University of Erlangen-Nuremberg, Germany, in 2015. He is the author of the book Sensor Array Beampattern Optimization: Theory with Applications (Science Press, 2009), in addition to over 100 journal and conference papers. His current research interests include statistical and array signal processing and their applications.
Prof. Yan was selected for the National Program for Support of Top-notch Young Professionals and received the Excellent Young Scientist Fund of China. He also received the 2008 Chinese National Excellent Doctoral Dissertation Award and the 2010 ICA-ASA Young Scientist Grant for excellent contributions to acoustics, and he is a co-recipient of the Best Paper Awards at SENSORCOMM 2008 and WASPAA 2009.
Emerging Cross-cutting Topics: Deep Learning for Big Data Analytics
Nanyang Tech Univ. - Singapore
Context modelling via deep feature learning
Context modelling is an important topic in computer vision. Existing methods usually perform context modelling as a post-processing step via graphical models such as CRFs, which can be very slow due to the energy minimization involved. In our work, we instead model context at the feature extraction stage by learning context-aware local features. The talk contains two parts. In the first part, I will introduce how we adapt Recurrent Neural Networks to 2-D images for effective recurrent feature learning; in the second part, I will introduce an episodic memory model that can iteratively incorporate long-term contextual dependencies. State-of-the-art performance is achieved on benchmark scene labeling datasets.
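The abstract does not spell out the architecture, but the core idea of adapting an RNN to 2-D images can be sketched as a directional sweep in which the feature at each pixel depends on everything scanned so far along its row. The NumPy sketch below is an illustrative single-direction, vanilla-tanh version only; practical systems of this kind typically combine several sweep directions and use gated units.

```python
import numpy as np

def row_scan(X, Wx, Wh):
    """Sweep a simple recurrent unit left-to-right along each image row,
    so the feature at (i, j) depends on pixels (i, 0..j)."""
    H, W, C = X.shape
    d = Wh.shape[0]
    h = np.zeros((H, d))           # one hidden state per row
    feats = np.zeros((H, W, d))
    for j in range(W):             # advance one column at a time
        h = np.tanh(X[:, j, :] @ Wx + h @ Wh)
        feats[:, j, :] = h
    return feats

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 3))      # toy 4x5 "image" with 3 channels
Wx = 0.1 * rng.standard_normal((3, 8))  # input-to-hidden weights (hypothetical)
Wh = 0.1 * rng.standard_normal((8, 8))  # hidden-to-hidden weights (hypothetical)
F = row_scan(X, Wx, Wh)
print(F.shape)  # (4, 5, 8): a context-aware feature per pixel
```

Running the same scan right-to-left, top-to-bottom, and bottom-to-top and concatenating the results would give each pixel a feature informed by the whole image, which is the sense in which the features become context-aware.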
Gang Wang is an adjunct research scientist who worked full time at ADSC until November 2015. He is also an Assistant Professor at Nanyang Technological University. He received his bachelor's degree from the Harbin Institute of Technology in 2005, and his Ph.D. degree at the University of Illinois at Urbana-Champaign in 2010. He is a recipient of the Harriett & Robert Perry fellowship and the CS/AI award at the University of Illinois at Urbana-Champaign.
Research Interests: Computer vision and machine learning.
Chinese Univ. of Hong Kong
Interpreting Neural Semantics in Deep Models
Deep learning has achieved great success in computer vision. Many people believe that the success is due to employing a huge number of parameters to fit big training data. In this talk, I will show that neuron responses of deep models have clear semantic interpretation, which is supported by our research in multiple fields: face recognition, object tracking, human pose estimation, and crowd video analysis. In particular, the responses of neurons in the top layers exhibit sparseness and strong selectiveness to object classes, attributes, and identities. Sparseness and selectiveness are strongly correlated, and such selectiveness is naturally obtained through large-scale training without adding extra regularization during the training process. By understanding neural semantics, we are inspired to develop new network architectures and training strategies, which effectively improve a broad range of applications in face recognition, face detection, compressing neural networks, object tracking, learned structured feature representation in human pose estimation, and effectively learning dynamic feature representations of different semantic units in video understanding.
Xiaogang Wang received his Bachelor's degree in Electronic Engineering and Information Science from the Special Class for Gifted Young at the University of Science and Technology of China in 2001, his M.Phil. degree in Information Engineering from the Chinese University of Hong Kong in 2004, and his Ph.D. degree in Computer Science from the Massachusetts Institute of Technology in 2009. He has been an associate professor in the Department of Electronic Engineering at the Chinese University of Hong Kong since August 2009. He received the Outstanding Young Researcher in Automatic Human Behaviour Analysis Award in 2011, the Hong Kong RGC Early Career Award in 2012, and the Young Researcher Award of the Chinese University of Hong Kong. He is an associate editor of the Image and Vision Computing journal. He was an area chair of ICCV 2011, ICCV 2015, ECCV 2014, and ACCV 2014. His research interests include computer vision, deep learning, crowd video surveillance, object detection, and face recognition.
Emerging Cross-cutting Topics: Internet of the Things
Nanjing University of Posts & Telecommunications
SDN and NFV Cooperation for 5G Communication System
Network virtualization is a key technology in the new-generation network architecture. Software-defined networking (SDN) and network function virtualization (NFV) are its two representative technologies, and their cooperation can provide flexible networking for various kinds of network usage demands. In this talk, R&D activities on network virtualization and its methods are first introduced. Then, we discuss methods for the cooperation of SDN and NFV in the 5G communication system.
Professor Zhu Hongbo, male, of Han nationality, is currently a supervisor of doctoral candidates. He was born in February 1956 in Yangzhou, Jiangsu Province. After graduating from the School of Telecommunication Engineering at NUPT with a Bachelor of Science degree in 1982, Professor Zhu remained there to teach. In 1996, he received his Ph.D. degree from the School of Telecommunication Engineering at Beijing University of Posts and Telecommunications (BUPT). He served as Deputy Dean of Electronic Engineering at NUPT from July 1997 to June 1999, Head of the Graduate School at NUPT from July 1999 to July 2005, and Head of the School of Communication and Information Engineering at NUPT from July 2005 to April 2009, and he has been Head of the Communication Technology Research Institute at NUPT since 2006. In June 2008, he began to serve as Vice President of NUPT.
Microsoft Research - Asia
Empowering Human Centric Video Understanding
Video is the biggest big data that contains an enormous amount of information. At Microsoft Research Asia, we are leveraging computer vision and deep learning to develop a cloud-based intelligence engine that can turn raw video data into insights to facilitate various applications and services. In this talk, I will introduce our recent effort on human centric video analysis and present some latest technologies we have developed including online learning based face/human tracking/identification/redaction, skeleton-based human action recognition, and real-time human action detection and forecasting in streaming video, etc. I will also shed some light on the go-to-market aspect of this intelligent cloud effort.
Wenjun (Kevin) Zeng is a Principal Research Manager overseeing the Internet Media Group and the Media Computing Group at Microsoft Research Asia, while on leave from the University of Missouri (MU), where he is a Full Professor. He worked for PacketVideo Corp., Sharp Labs of America, Bell Labs, and Panasonic Technology prior to joining MU in 2003. Wenjun has contributed significantly to the development of international standards (ISO MPEG, JPEG2000, and OMA). He received his B.E., M.S., and Ph.D. degrees from Tsinghua University, the University of Notre Dame, and Princeton University, respectively. His current research interests include mobile-cloud media computing, computer vision, social network/media analysis, multimedia communications, and content/network security.
He is a Fellow of the IEEE. He is an Associate Editor-in-Chief of IEEE Multimedia Magazine, and was an AE of IEEE Trans. on Circuits & Systems for Video Technology (TCSVT), IEEE Trans. on Info. Forensics & Security, and IEEE Trans. on Multimedia (TMM). He is/was on the Steering Committee of IEEE Trans. on Mobile Computing (current) and IEEE TMM (2009-2012). He served as the Steering Committee Chair of IEEE ICME in 2010 and 2011, and has served as the TPC Chair of several IEEE conferences (e.g., ChinaSIP’15, WIFS’13, ICME’09, CCNC’07). He will be a general co-Chair of ICME2018. He is currently guest editing an IEEE Communications Magazine Special Issue on Impact of Next-Generation Mobile Technologies on IoT-Cloud Convergence and a TCSVT Special Issue on Visual Computing in the Cloud - Mobile Computing, and was a Special Issue Guest Editor for the Proceedings of the IEEE, IEEE TMM, and ACM TOMCCAP.
Linearized Alternating Direction Methods for Solving Constrained Optimization Problems
In machine learning and signal processing, we are often faced with constrained optimization problems. The linearized alternating direction method (LADM) is a convenient approach for solving many such problems. In this tutorial, I will introduce the basics of LADM and its variations and give examples of applying it.
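To give a flavor of the method, the sketch below applies a linearized ADM to the basis pursuit problem min ||x||_1 s.t. Ax = b: linearizing the quadratic penalty of the augmented Lagrangian at the current iterate turns each x-update into a cheap soft-threshold. The step sizes and the toy problem are illustrative choices, not taken from the tutorial itself.

```python
import numpy as np

def soft(v, tau):
    """Soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ladm_basis_pursuit(A, b, beta=1.0, iters=1000):
    """LADM for min ||x||_1 s.t. Ax = b. The quadratic penalty
    beta/2 * ||Ax - b||^2 is linearized at x^k, so each x-update is a
    single soft-threshold; eta must exceed the squared spectral norm of A."""
    m, n = A.shape
    eta = 1.01 * np.linalg.norm(A, 2) ** 2
    x, lam = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        grad = A.T @ (lam + beta * (A @ x - b))     # gradient of the linearized part
        x = soft(x - grad / (beta * eta), 1.0 / (beta * eta))
        lam = lam + beta * (A @ x - b)              # dual (multiplier) ascent
    return x

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
b = np.array([1.0, 2.0])
print(np.round(ladm_basis_pursuit(A, b), 3))  # approximately [1., 2., 0.]
```

The appeal of the linearization is visible here: no linear system involving A^T A is ever solved, so each iteration costs only matrix-vector products, which is what makes LADM attractive for large-scale problems.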
Zhouchen Lin received the Ph.D. degree in applied mathematics from Peking University in 2000. He is currently a Professor at the Key Laboratory of Machine Perception (MOE), School of Electronics Engineering and Computer Science, Peking University. He is also a Chair Professor at Northeast Normal University and a guest professor at Beijing Jiaotong University. He was a guest professor at Shanghai Jiaotong University and Southeast University, and a guest researcher at the Institute of Computing Technology, Chinese Academy of Sciences. His research areas include image processing, computer vision, pattern recognition, machine learning, and numerical optimization. He is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence and the International Journal of Computer Vision, and an area chair of CVPR 2014, ICCV 2015, NIPS 2015, AAAI 2016, CVPR 2016, and IJCAI 2016.
Zack Chase Lipton
University of California, San Diego
Learning to Diagnose with LSTM Recurrent Neural Networks
Clinical medical data, especially in the intensive care unit (ICU), consist of multivariate time series of observations. For each patient visit (or episode), sensor data and lab test results are recorded in the patient's Electronic Health Record (EHR). While potentially containing a wealth of insights, the data is difficult to mine effectively, owing to varying length, irregular sampling and missing data. Recurrent Neural Networks (RNNs), particularly those using Long Short-Term Memory (LSTM) hidden units, are powerful and increasingly popular models for learning from sequence data. They effectively model varying length sequences and capture long range dependencies. We present the first study to empirically evaluate the ability of LSTMs to recognize patterns in multivariate time series of clinical measurements. Specifically, we consider multilabel classification of diagnoses, training a model to classify 128 diagnoses given 13 frequently but irregularly sampled clinical measurements. First, we establish the effectiveness of a simple LSTM network for modeling clinical data. Then we demonstrate a straightforward and effective training strategy in which we replicate targets at each sequence step. Trained only on raw time series, our models outperform several strong baselines, including a multilayer perceptron trained on hand-engineered features.
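The target replication strategy described above can be sketched as a loss function on its own: the episode's label vector is copied to every sequence step and the per-step losses are averaged, rather than scoring only the final step. In the paper this is applied to per-step LSTM outputs; in the minimal NumPy sketch below the per-step predictions are just a toy array of sigmoid outputs.

```python
import numpy as np

def replicated_target_loss(step_probs, y):
    """Target replication: replicate the episode label vector y at every
    sequence step and average the per-step binary cross-entropies,
    instead of computing the loss at the final step only."""
    eps = 1e-12
    p = np.clip(step_probs, eps, 1 - eps)  # (T, K) per-step sigmoid outputs
    bce = -(y * np.log(p) + (1 - y) * np.log(1 - p))  # y broadcasts over steps
    return bce.mean()

T, K = 6, 4                      # 6 time steps, 4 diagnosis labels (toy sizes)
probs = np.full((T, K), 0.5)     # maximally uncertain predictions
y = np.array([1.0, 0.0, 0.0, 1.0])
print(replicated_target_loss(probs, y))  # log(2) ~ 0.693 for 0.5 predictions
```

Because every step receives a training signal, gradients reach early timesteps directly rather than only through backpropagation from the sequence end, which is what makes this simple strategy effective for long clinical time series.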
I am a graduate student in the Artificial Intelligence Group at the University of California, San Diego. I work on deep learning methods and applications of machine learning. In particular, I work on sequential modeling with recurrent neural networks, and I am especially interested in research impacting medicine and natural language processing. Recently, in Learning to Diagnose with LSTM RNNs, we trained LSTM RNNs to accurately predict patient diagnoses using only lightly processed time series of sensor readings in the pediatric ICU.
In other work, I develop algorithms that exploit sparsity for efficient large-scale multilabel learning on natural language datasets. Last year, my paper on optimal thresholding to maximize F1 was published in the proceedings of ECML 2014. In it, we mathematically characterize the threshold that maximizes the F1 metric and critically examine the complicated relationship between evaluation methodology and decision theory.
Before coming to UCSD, I completed a Bachelor of Arts with a joint major in Mathematics and Economics at Columbia University. Then, I worked in New York City as a jazz musician. In summer 2014, I interned at Microsoft Research in Bangalore and in 2015 I interned for Amazon's Core Machine Learning team. My research has been generously supported by UCSD department of biomedical informatics via an NLM biomedical informatics training grant and by hardware/cloud compute donations from NVIDIA and Amazon.