TensorFlow 0.12, Ubuntu 16.04, CUDA 8

Task: install the TensorFlow framework on Ubuntu 16.04 with CUDA 8.0

Update the system

  • Install build essentials:
    • sudo apt-get install build-essential
  • Install latest version of kernel headers:
    • sudo apt-get install linux-headers-`uname -r`

Install CUDA

  • Install curl (for the CUDA download):
    • sudo apt-get install curl
  • Download CUDA 8.0 to Downloads folder
  • Make the downloaded installer file runnable:
    • chmod +x cuda_8.0.44_linux.run
  • Run the CUDA installer:
    • sudo ./cuda_8.0.44_linux.run --kernel-source-path=/usr/src/linux-headers-`uname -r`/
      • Accept the EULA
      • Do NOT install the graphics card drivers (since we are in a virtual machine)
      • Install the toolkit (leave path at default)
      • Install symbolic link
      • Install samples (leave path at default)
  • Update the library path
    • echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
    • echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/lib' >> ~/.bashrc
    • source ~/.bashrc 
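  • Verify the toolkit is on the PATH (a quick sanity check; the reported release should be 8.0):
    • nvcc --version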

Install pip

  • Install pip:
    • sudo apt-get install python-pip python-dev
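  • Check pip works (optional):
    • pip --version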

Install Tensorflow

  • Install TensorFlow:
    • pip install tensorflow
    • Note: the plain tensorflow package is CPU-only; for the CUDA-enabled build, install tensorflow-gpu instead (pip install tensorflow-gpu).

Test Tensorflow:

  • Test TensorFlow:
    • python
    • import tensorflow as tf
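  • A minimal session test (this follows the TensorFlow 0.12-era API, where a graph is run inside an explicit session):
    • hello = tf.constant('Hello, TensorFlow!')
    • sess = tf.Session()
    • print(sess.run(hello))
    • sess.close()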

Congratulations! You have successfully installed TensorFlow on Ubuntu 16.04 with CUDA 8.0. Let’s enjoy it now.

Computer Vision, Image Processing and Pattern Recognition Conference and Journal List to target

Computer Vision, Image Processing and Pattern Recognition Conference List

[Screenshot: conference list table]

Submission dates

[Screenshot: submission dates table]

Computer Vision, Image Processing and Pattern Recognition journals

[Screenshot: journal list table]

 

CVPR: IEEE Conference on Computer Vision and Pattern Recognition

ECCV: European Conference on Computer Vision

ICCV: IEEE International Conference on Computer Vision

NIPS: Annual Conference on Neural Information Processing Systems

ICIP: IEEE International Conference on Image Processing

ICLR: International Conference on Learning Representations

ICPR: International Conference on Pattern Recognition

CVPRW: IEEE Conference on Computer Vision and Pattern Recognition Workshops

ACCV: Asian Conference on Computer Vision

AVSS: International Conference on Advanced Video and Signal-Based Surveillance

WACV: IEEE Winter Conference on Applications of Computer Vision

DICTA: International Conference on Digital Image Computing: Techniques and Applications

BIOSIG: International Conference of the Biometrics Special Interest Group

IVCNZ: Image and Vision Computing New Zealand Conference

Deep learning spatio-temporal descriptors for videos

The main aim is to incorporate both the spatial and temporal structure of videos into the representation.

1. 3D CNN

C3D [project][paper] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning Spatiotemporal Features with 3D Convolutional Networks, ICCV 2015.

[paper] S. Ji, W. Xu, M. Yang and K. Yu, 3D Convolutional Neural Networks for Human Action Recognition, TPAMI 2013.

[project] [paper] Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei, Large-scale Video Classification with Convolutional Neural Networks, CVPR 2014.

2. Two streams: one CNN for spatial, one for temporal (usually optical flow)

[Paper] Karen Simonyan, Andrew Zisserman, Two-Stream Convolutional Networks for Action Recognition in Videos, NIPS 2014.

[Project] [Paper] G. Gkioxari and J. Malik, Finding action tubes, in CVPR, 2015.

3. CNN+LSTM: LRCN [project] [paper]

Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko and Trevor Darrell, Long-term Recurrent Convolutional Networks for Visual Recognition and Description, CVPR 2015.

Zhen Zuo, Bing Shuai, Gang Wang, Xiao Liu, Xingxing Wang, Bing Wang, Yushi Chen, Convolutional Recurrent Neural Networks: Learning Spatial Dependencies for Image Representation, CVPR 2015.

Ming Liang, Xiaolin Hu, Recurrent Convolutional Neural Network for Object Recognition, CVPR 2015.

Pedro O. Pinheiro and Ronan Collobert, Recurrent Convolutional Neural Networks for Scene Labeling, ICML 2014.

 

4. CNN+GRU: GRU-RCN [paper]

Nicolas Ballas, Li Yao, Chris Pal, Aaron Courville, Delving Deeper into Convolutional Networks for Learning Video Representations, ICLR 2016.

5. Grid LSTM:

Nal Kalchbrenner, Ivo Danihelka, Alex Graves, Grid Long Short-Term Memory, ICLR 2016.

——————————————————————————

Considering the large number of video applications in the SAIVT lab (action recognition, abnormal event detection, facial expression, VQA), it’s worthwhile investigating an effective spatio-temporal video descriptor. In the literature, researchers are trying different approaches to get temporal information into the representation:

  1. 3D CNN: temporal information is captured in the third dimension of the filter kernels.
  2. Two parallel CNNs: one for the spatial stream and one for the temporal stream (optical flow).
  3. CNNs followed by RNNs (LSTM, GRU); see the sketch after this list.
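
A minimal TensorFlow sketch of approaches 1 and 3 (assuming the TensorFlow 0.12-era API; the shapes and layer choices are illustrative only, not taken from any of the papers above):

import tensorflow as tf

# hypothetical video batch: [batch, frames, height, width, channels]
video = tf.placeholder(tf.float32, [None, 16, 112, 112, 3])

# approach 1 - 3D CNN: the kernel has a temporal extent (3 frames here),
# so temporal structure is encoded locally inside the convolution itself
k3d = tf.Variable(tf.truncated_normal([3, 3, 3, 3, 64], stddev=0.01))
conv3d = tf.nn.conv3d(video, k3d, strides=[1, 1, 1, 1, 1], padding='SAME')

# approach 3 - CNN + RNN: compute a (toy) per-frame feature, then let an
# LSTM aggregate the frame features over time (a global temporal model)
frame_feats = tf.reduce_mean(video, [2, 3])  # [batch, frames, channels]
cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=256)
outputs, state = tf.nn.dynamic_rnn(cell, frame_feats, dtype=tf.float32)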

I have two ideas to explore here:

  1. Similar to our idea in CNN+CRF, the initial idea is to combine the CNN and the RNN in one formulation. This would save computation compared to three separate optimisation stages, while incorporating better constraints into the representation.
  2. A pure RNN network: modify the RNN to incorporate both spatial and temporal information. RNNs are inherently good at representing temporal information; the question is how to represent the spatial part. Recently a paper argued that a 2D LSTM is equivalent to a CNN, which may be the clue.

Another way to look at the above approaches: 3D CNNs encode temporal information locally, while RNNs encode it globally. Which to choose depends on the application: for abnormal event detection of individuals, the local approach is critical; for groups, the global approach is key.

 

 

Caffe installation guide

Task: install the Caffe framework on Ubuntu 15.04 with CUDA 7.5

Link to Kubuntu 15.04: kubuntu-15.04-desktop-amd64.iso

Link to CUDA 7.5: Download (1.1 GB)

 

Update the system

  • Install build essentials:
    • sudo apt-get install build-essential
  • Install latest version of kernel headers:
    • sudo apt-get install linux-headers-`uname -r`

Install CUDA

  • Install curl (for the CUDA download):
    • sudo apt-get install curl
  • Download CUDA 7.5 to Downloads folder
  • Make the downloaded installer file runnable:
    • chmod +x cuda_7.5.18_linux.run
  • Run the CUDA installer:
    • sudo ./cuda_7.5.18_linux.run --kernel-source-path=/usr/src/linux-headers-`uname -r`/
      • Accept the EULA
      • Do NOT install the graphics card drivers (since we are in a virtual machine)
      • Install the toolkit (leave path at default)
      • Install symbolic link
      • Install samples (leave path at default)
  • Update the library path
    • echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
    • echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/lib' >> ~/.bashrc
    • source ~/.bashrc

Install dependencies

  • sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev protobuf-compiler gfortran libjpeg62 libfreeimage-dev libatlas-base-dev git python-dev python-pip libgoogle-glog-dev libbz2-dev libxml2-dev libxslt-dev libffi-dev libssl-dev libgflags-dev liblmdb-dev python-yaml
  • sudo easy_install pillow

Install Caffe

  • Download Caffe:
    • git clone https://github.com/BVLC/caffe.git
  • Install python dependencies for Caffe:
    • cd caffe
    • cat python/requirements.txt | xargs -L 1 sudo pip install
  • Add symbolic links so the build can find the Python and NumPy headers:
    • sudo ln -s /usr/include/python2.7/ /usr/local/include/python2.7
    • sudo ln -s /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ /usr/local/include/python2.7/numpy
  • Create a Makefile.config from the example:
    • cp Makefile.config.example Makefile.config
    • Edit Makefile.config
      • Uncomment the line # CPU_ONLY := 1 (in a virtual machine we do not have access to the GPU)
      • Under PYTHON_INCLUDE, replace /usr/lib/python2.7/dist-packages/numpy/core/include with /usr/local/lib/python2.7/dist-packages/numpy/core/include (i.e. add /local)
      • Add reference to hdf5
        INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/lib/x86_64-linux-gnu/hdf5/serial/include
        LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial
  • Add “opencv_imgcodecs” to the LIBRARIES line in the Makefile:
    • LIBRARIES += glog gflags protobuf leveldb snappy \
      lmdb boost_system hdf5_hl hdf5 m \
      opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs
  • Compile Caffe:
    • make pycaffe
    • make all
    • make test
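  • Verify the build (optional; the PYTHONPATH line assumes the repository was cloned to ~/caffe):
    • make runtest
    • echo 'export PYTHONPATH=~/caffe/python:$PYTHONPATH' >> ~/.bashrc
    • source ~/.bashrc
    • python -c "import caffe"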

Congratulations! You have successfully installed Caffe on Ubuntu 15.04 with CUDA 7.5. Let’s enjoy it now.

You can play with it in Python, but I prefer IPython Notebook since it lets us debug and get a deep insight into what happens in the code, rather than treating it as a black box.

(In case IPython Notebook runs into an error: for IPython 4.0 and above, the notebook app is installed separately from https://github.com/jupyter/notebook, so run:

pip install jupyter

)


-------------------------

 

Caffe on Ubuntu 16.04, CUDA 8, OpenCV 3.1

https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-Installation-Guide

https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-OpenCV-3.1-Installation-Guide

If there is an error like #error -- unsupported GNU version! gcc versions later than 5.3 are not supported!, then the solution is here:

https://github.com/BVLC/caffe/wiki/GeForce-GTX-1080,---CUDA-8.0,---Ubuntu-16.04,---Caffe

If there is an error with graph cut (nppiGraphcut missing), the solution is here:
http://answers.opencv.org/question/95148/cudalegacy-not-compile-nppigraphcut-missing/

 

 

Deep Learning resources

COURSES

1. Stanford – Fei-Fei Li, Karpathy – Convolutional Neural Networks for Visual Recognition (CS231n)

http://cs231n.stanford.edu

2. New York University – Yann LeCun – Deep Learning

http://cilvr.cs.nyu.edu/doku.php?id=courses:deeplearning2015:start

3. Virginia Tech – Deep Learning for Perception

https://computing.ece.vt.edu/~f15ece6504/

4. Toronto – G. Hinton – Neural Networks for Machine Learning

https://www.coursera.org/course/neuralnets/?action=enroll&sessionId=256

5. Toronto – Introduction to Neural Networks for Machine Learning

http://www.cs.toronto.edu/~tijmen/csc321/

6. Montreal – Bengio, LeCun – Deep Learning Summer School 2015

https://sites.google.com/site/deeplearningsummerschool/

7. CUHK – Xiaogang Wang – Introduction to Deep Learning 2015

https://piazza.com/cuhk.edu.hk/spring2015/eleg5040/resources

8. Google – Deep Learning 2016 using TensorFlow

https://www.udacity.com/course/deep-learning--ud730

9. Toronto – Deep learning in Computer Vision

http://www.cs.utoronto.ca/~fidler/teaching/2015/CSC2523.html#syllabus

10. Oxford – Nando de Freitas – Machine Learning

https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/

 ———————————–

BOOKS

Free online books

1. Deep Learning – Ian Goodfellow, Yoshua Bengio and Aaron Courville

http://www.deeplearningbook.org

2. Neural Networks and Deep Learning – Michael Nielsen

http://neuralnetworksanddeeplearning.com

3. Deep Learning Tutorial – LISA lab, University of Montreal

http://deeplearning.net/tutorial/deeplearning.pdf

————————————————————————–

BEGINNING

https://brohrer.github.io/how_convolutional_neural_networks_work.html

https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/

An Intuitive Explanation of Convolutional Neural Networks

Convolutional Neural Networks (CNNs): An Illustrated Explanation

https://au.mathworks.com/campaigns/products/artifacts/deep-learning.html

———————————-

INTERMEDIATE

—————————–

ADVANCED

1. Deep Learning School, Bay Area, 24-25/9/2016

http://www.bayareadlschool.org

Part 1: https://www.youtube.com/watch?v=eyovmAtoUx0

Part 2: https://www.youtube.com/watch?v=9dXiAecyJrY

2. Deep Learning Summer School, Montreal, 1-7/8/2016

https://sites.google.com/site/deeplearningsummerschool2016/home

http://videolectures.net/deeplearning2016_montreal/

Deepnet frameworks

There are plenty of deepnet frameworks out there. The two most popular among academics are Caffe and Torch. Both are very fast and ship with a number of pre-trained models. Arguably, Caffe is more popular than Torch; however, Torch is backed by Google and Facebook.

Caffe:

  • Key players:
  • Installation: https://github.com/BVLC/caffe/wiki/Ubuntu-14.04-VirtualBox-VM
  • Start point: http://caffe.berkeleyvision.org/gathered/examples/mnist.html

 

Torch:

  • Key players: Facebook, Google, Li Fei-Fei (Stanford course)
  • Installation: http://torch.ch/docs/getting-started.html#_
  • Start point: http://torch.madbits.com/wiki/doku.php?id=start
  • Tutorial: https://github.com/soumith/cvpr2015/blob/master/cvpr-torch.pdf

TensorFlow:

  • Key players: Google
  • Installation: see the TensorFlow installation guide above (Ubuntu 16.04, CUDA 8)

 

 

If you just want to play around with pre-trained models, go for the MATLAB MatConvNet toolbox.

 

 

NeuralTalk set up/train/test

NeuralTalk – Deep Visual-Semantic Alignments for Generating Image Descriptions (CVPR’15) – Stanford, Li Fei-Fei

NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences. It ranked 13th on the current leaderboard of the Microsoft COCO image captioning challenge. NeuralTalk uses VGG features as input to the Multimodal Recurrent Neural Network.

The following sections are organised as follows:

  • 1. Installation
  • 2. Testing with pre-trained models
  • 3. Training our own models

 

1. Installation

NeuralTalk

MatConvNet: NeuralTalk uses VGG 4096-D features, so we will extract these features using MatConvNet, a MATLAB toolbox implementing Convolutional Neural Networks (CNNs) for computer vision applications.

  • Download: http://www.vlfeat.org/matconvnet/download/matconvnet-1.0-beta13.tar.gz
  • Compile the toolbox:
    > cd <MatConvNet>
    > addpath matlab
    > vl_compilenn
  • Download the VGG model:
    > run matlab/vl_setupnn
    
    % download a pre-trained CNN from the web
    > urlwrite('http://www.vlfeat.org/sandbox-matconvnet/models/imagenet-vgg-verydeep-19.mat', 'imagenet-vgg-verydeep-19.mat') ;
    > net = load('imagenet-vgg-verydeep-19.mat') ;

 

2. Testing with pre-trained models

The code allows you to easily predict and visualise results of running the model on COCO/Flickr8K/Flickr30K images. To run the code on an arbitrary image, things get a little more complicated: we first need to pipe the image through the VGG CNN to get the 4096-D activations on top. In this post, I will show how to calculate 4096-D VGG features using MatConvNet. Later I will add how to calculate VGG features using Caffe. Say we want to test two images saved in the example_images folder:

1. Modify tasks.txt
Clear tasks.txt and add the two image names, one per line:
tennis.jpg
skiing.jpg

2. In MATLAB: Calculate VGG features using MatConvNet, then save to vgg_feats.mat file:

cd /home/kien/Documents/MATLAB/matconvnet-1.0-beta13
run matlab/vl_setupnn

net = load('imagenet-vgg-verydeep-19.mat') ;
images = {'tennis.jpg', 'skiing.jpg'} ;
feats = [] ;

for i = 1:numel(images)
    % obtain and preprocess an image
    im = imread(['/home/kien/neuraltalk/neuraltalk/example_images/' images{i}]) ;
    im_ = single(im) ; % note: 255 range
    im_ = imresize(im_, net.normalization.imageSize(1:2)) ;
    im_ = im_ - net.normalization.averageImage ;
    % run the CNN
    res = vl_simplenn(net, im_) ;
    % show the classification result
    scores = squeeze(res(end).x) ;
    [bestScore, best] = max(scores) ;
    figure(i) ; clf ; imagesc(im) ;
    title(sprintf('%s (%d), score %.3f', ...
        net.classes.description{best}, best, bestScore)) ;
    % layer 42 holds the 4096-D fully-connected activation NeuralTalk expects
    feats(:,i) = squeeze(res(42).x) ;
end

save('/home/kien/neuraltalk/neuraltalk/example_images/vgg_feats.mat', 'feats') ;

3. In TERMINAL:
Go to the neuraltalk folder and type: python predict_on_images.py lstm_model.p -r example_images/

Open result.html in the example_images folder to see the results.

 

3. Training our own models

If we want to train the model, download the data to the data/ folder from:

Run the training: $ python driver.py

Relax and wait now…

I will dig into the driver.py file later.