computer vision - Is it possible to train an SVM or Random Forest on the final layer feature of a Convolutional Neural Network using Keras?

I have designed a Convolutional Neural Network in Keras for image classification with several convolution/max-pooling layers, one densely connected hidden layer and softmax activation on the final layer. I want to replace softmax with an SVM or Random Forest in the final layer to see if that yields better accuracy. Is there any way to do it in Keras?...

computer vision - Deep Learning: Using a pretrained network's earlier activations

I have around 25k images belonging to 14 different classes (different kinds of necklines, e.g. v-neck, round neck, etc.). The images mainly contain the top part of the apparel and/or the face of the model. Here are some examples: In order to do this, I thought of extracting the features after the 1st block of VGG16 (pretrained on ImageNet) because the feature maps of the earlier blocks will be capturing things like lines, shapes, etc. Here is the model.summary(): Layer (type) Output Shape Param # ====================...

computer vision - Why scale pixels between -1 and 1 sample-wise in the preprocess step for image classification

In the preprocess_input() function found at the link below, the pixels are scaled between -1 and 1. I have seen this used elsewhere as well. What is the reason for scaling between -1 and 1 as opposed to 0 and 1? I was under the impression that common ranges for pixels were 0-255 or, if normalized, 0-1. https://github.com/keras-team/keras/blob/master/keras/applications/imagenet_utils.py ...
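
To make the two conventions in this question concrete, the sketch below maps 8-bit pixel values to [0, 1] and to [-1, 1] in plain Python. The helper names are hypothetical and this is not the code from the linked file; it only illustrates that dividing by 127.5 and subtracting 1 yields a range centred on zero, whereas dividing by 255 yields a strictly non-negative range.

```python
def scale_to_unit(pixels):
    """Map 8-bit values in [0, 255] to [0, 1]."""
    return [p / 255.0 for p in pixels]


def scale_to_signed_unit(pixels):
    """Map 8-bit values in [0, 255] to [-1, 1] (zero-centred)."""
    return [p / 127.5 - 1.0 for p in pixels]


def mean(values):
    return sum(values) / len(values)


# Hypothetical row of pixel intensities spanning the 8-bit range.
row = [0, 64, 128, 191, 255]
unit = scale_to_unit(row)
signed = scale_to_signed_unit(row)

print(unit[0], unit[-1], mean(unit))        # range [0, 1], mean near 0.5
print(signed[0], signed[-1], mean(signed))  # range [-1, 1], mean near 0
```

A commonly cited motivation for the signed range is that inputs with roughly zero mean tend to suit gradient-based training better, and that a pretrained network expects whatever scaling it was trained with.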

computer vision - Convolutional Neural Networks - Multiple Channels

How is the convolution operation carried out when multiple channels are present at the input layer (e.g. RGB)? After doing some reading on the architecture/implementation of a CNN, I understand that each neuron in a feature map references NxM pixels of an image as defined by the kernel size. Each pixel is then multiplied by the feature map's learned NxM weight set (the kernel/filter), summed, and input into an activation function. For a simple greyscale image, I imagine the operation would be something akin to the following pseudocode: for i in r...
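
As a hedged illustration of the multi-channel case asked about here: in the usual formulation each filter holds one NxM weight set per input channel, and the products are summed over both the kernel positions and the channels to give a single value in the feature map. The function name and the toy data below are made up for this sketch, which uses plain Python loops and omits stride, padding and the activation function.

```python
def conv2d_multichannel(image, kernel, bias=0.0):
    """Valid cross-correlation of a multi-channel image with one filter.

    image:  list of C channels, each an H x W list of lists
    kernel: list of C channels, each an N x M list of lists
    Returns one (H-N+1) x (W-M+1) feature map (before any activation).
    """
    channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    k_h, k_w = len(kernel[0]), len(kernel[0][0])
    out = []
    for y in range(height - k_h + 1):
        row = []
        for x in range(width - k_w + 1):
            total = bias
            # Sum over every channel AND every kernel position.
            for c in range(channels):
                for i in range(k_h):
                    for j in range(k_w):
                        total += image[c][y + i][x + j] * kernel[c][i][j]
            row.append(total)
        out.append(row)
    return out


# Toy 3-channel (RGB-like) 3x3 image and one 2x2 weight set per channel.
rgb = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],  # R
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],  # G
    [[2, 2, 2], [2, 2, 2], [2, 2, 2]],  # B
]
filt = [
    [[1, 0], [0, 1]],    # weights applied to R
    [[1, 1], [1, 1]],    # weights applied to G
    [[0.5, 0], [0, 0]],  # weights applied to B
]
feature_map = conv2d_multichannel(rgb, filt)
print(feature_map)  # [[9.0, 11.0], [15.0, 17.0]]
```

Note that the three channels produce one feature map per filter, not three; a layer with K filters would repeat this with K different weight sets.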

computer vision - Calculation of corner points for the localization of robot in 3D data

I segment out the subset of a point cloud that fits a line, using the pcl::SACMODEL_LINE RANSAC line segmentation module. In the next step, the center point of the extracted point cloud is computed using pcl::compute3DCentroid(point_cloud, centroid); which gives an accurate center point as long as the camera and the extracted line model object are parallel to each other. In the last step, the corner points of the extracted point cloud, i.e. of the fitted line, are calculated by adding a known distance to the center point. This technique will...
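
One possible way to keep the "center plus known distance" idea independent of the viewing angle is to offset the centroid along the fitted line's direction rather than along a fixed axis. The sketch below is in Python rather than PCL's C++, and assumes a direction vector is available (for example from the line model's coefficients); the function name and all numeric values are hypothetical.

```python
import math


def line_endpoints(centroid, direction, length):
    """Return the two end points of a segment of known length.

    centroid:  (x, y, z) centre of the extracted line points
    direction: (dx, dy, dz) direction of the fitted line (any magnitude)
    length:    known physical length of the object
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    half = length / 2.0
    start = tuple(c - half * u for c, u in zip(centroid, unit))
    end = tuple(c + half * u for c, u in zip(centroid, unit))
    return start, end


# Hypothetical values: a 2 m long object tilted in the x-z plane.
start, end = line_endpoints((1.0, 0.0, 3.0), (3.0, 0.0, 4.0), 2.0)
print(start, end)  # approximately (0.4, 0.0, 2.2) and (1.6, 0.0, 3.8)
```

This still relies on the centroid being a good estimate of the true midpoint, which may not hold if only part of the object is visible.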

computer vision - output feature map dimension of VGG16 model

I saw the feature extraction example in the Keras docs and used the following code to extract features from an input image:

    input_shape = (224, 224, 3)
    model = VGG16(weights = 'imagenet', input_shape = (input_shape[0], input_shape[1], input_shape[2]), pooling = 'max', include_top = False)
    img = image.load_img(img_path, target_size=(input_shape[0], input_shape[1]))
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = preprocess_input(img)
    feature = model.predict(img)

Then when I output the shape of the feature variable, I found it is (1, ...
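
The shape arithmetic behind a headless VGG16 can be sketched without loading the model. The helper below is hypothetical and assumes the standard VGG16 layout: 'same'-padded convolutions, five blocks that each end in a 2x2 max-pool with stride 2, and 512 channels in the last block. It does not call Keras, so the real model should be checked with model.summary().

```python
def vgg16_feature_shape(height, width, pooling=None):
    """Output shape of VGG16 without its top, for a batch of one image.

    Assumes the standard layout: five blocks, each ending in a 2x2
    max-pool with stride 2, and 512 channels in the last block.
    """
    num_pool_stages = 5
    last_block_channels = 512
    for _ in range(num_pool_stages):
        # Each pooling stage halves the spatial size (rounding down).
        height //= 2
        width //= 2
    if pooling in ("max", "avg"):
        # Global pooling collapses the spatial grid to one value per channel.
        return (1, last_block_channels)
    return (1, height, width, last_block_channels)


print(vgg16_feature_shape(224, 224))                 # (1, 7, 7, 512)
print(vgg16_feature_shape(224, 224, pooling="max"))  # (1, 512)
```

Under these assumptions, the pooling argument decides whether the result keeps its 7x7 spatial grid or is reduced to a single 512-dimensional vector.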

computer vision - Comparison of HoG with CNN

I am working on a comparison of Histogram of Oriented Gradients (HoG) and Convolutional Neural Network (CNN) for weed detection. I have two datasets of two different weeds. The CNN architecture is a 3-layer network. 1) The 1st dataset contains two classes and has 18 images. The dataset is enlarged using data augmentation (rotation, adding noise, illumination changes). Using the CNN I am getting a testing accuracy of 77%, and for HoG with SVM 78%. 2) The second dataset contains leaves of two different plants. Each class contains 2500 images without data augmen...

computer vision - Matching features of images using PCA-SIFT

I want to match features in two images to detect copy-move forgery. I used the PCA-SIFT code to detect image features, but I am having trouble matching the PCA-SIFT features. According to several papers, a similar matching process is used for PCA-SIFT as for SIFT. I have used the following code snippet to match features.

    % des1 and des2 are the PCA-SIFT descriptors obtained from two images
    % Precompute matrix transpose
    des2t = des2';
    matchTable = zeros(1,size(des1,1));
    cnt=0; % no. of matches
    % ratio of distances
    distRatio = 0....
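
The "ratio of distances" in the snippet points to a nearest/second-nearest ratio test. A minimal Python sketch of that idea follows; it uses plain Euclidean distance, which may differ from the measure in the original MATLAB code, and the 0.8 threshold is an assumed illustrative value, not the one elided in the question. The function name and sample descriptors are hypothetical.

```python
import math


def ratio_test_matches(des1, des2, ratio=0.8):
    """Match descriptors with a nearest/second-nearest distance ratio test.

    des1, des2: lists of equal-length descriptor vectors
    ratio:      assumed threshold; not the value from the question
    Returns a list where entry i is the index in des2 matched to des1[i],
    or None when the best match is not distinctive enough.
    """
    matches = []
    for d1 in des1:
        # Distances to every candidate, smallest first.
        dists = sorted((math.dist(d1, d2), j) for j, d2 in enumerate(des2))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append(dists[0][1])
        else:
            matches.append(None)
    return matches


# Toy 2-D descriptors: the first has one clear neighbour, the second
# is equally close to two candidates and is therefore rejected.
des_a = [[0.0, 0.0], [5.0, 5.0]]
des_b = [[0.1, 0.0], [4.0, 4.0], [6.0, 6.0]]
result = ratio_test_matches(des_a, des_b)
print(result)  # [0, None]
```

Because PCA-SIFT descriptors are ordinary real-valued vectors, the same test applies to them unchanged; only the descriptor length differs from standard SIFT.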

computer vision - Navigation of maze with a group (cluster) of robots

I am thinking about starting a new project with the purpose of mapping and navigating a maze with a cluster of robots. I was thinking of using 2 or 3 robots. The following assumptions are made: The robots are each fitted with a camera to help detect the walls of the maze. The size and shape of the maze are unknown and can be changed at will. The robots should communicate and efficiently divide the task of mapping and navigation among themselves. I am studying Electrical Engineering and have n...

kinect - What main factors/features explain the high price of most industrial computer vision hardware?

I am a student currently working on a computer science project that will soon require computer vision, more specifically stereoscopy (for depth detection). I am now looking for a great camera to do the job and I found several interesting options: 1- A custom-built set of two cheap cameras (i.e. webcams); 2- The old, classic, economical and proven Kinect; 3- Specialized stereo sensors. A couple of months ago I found this sensor: https://duo3d.com/product/duo-mini-lv1 I thought it was interesting because it is small, stereoscopic and brand new (enc...

computer vision - Hardware to use for 3D Depth Perception

I am planning on giving a prebuilt robot 3D vision by integrating a 3D depth sensor such as a Kinect or Asus Xtion Pro. These are the only two that I have been able to find, yet I would imagine that a lot more are being built or already exist. Does anyone have any recommendations for hardware that I can use, or which of these two is better for an open source project with integration to ROS (Robot Operating System)?...

computer vision - Issues while warping image using Homography mat constructed using matches given by BF Matcher

We are trying to continuously process the image frames captured by two cameras, processing every two frames and then stitching them to get a complete view. In order to do so we have:
1. Extracted SURF features.
2. Got the matches between the two images using the FLANN matcher.
3. Computed the homography matrix using these matches.
4. Applied warpPerspective on the right image.

    // To get the SURF keypoints and descriptors:
    cuda::SURF_CUDA surf(700);
    surf(leftImgGpu, cuda::GpuMat(), keyPointsGpuA, descriptorsAGpu);
    surf(rightImgGpu, cuda::GpuMat(), keyPointsGpuB, descr...

computer vision - Visualize the learned filter of each CNN layer

Can anyone please tell me how to visualize the learned filters of each CNN layer? The following answers tell me how to visualize the learned filters of the first CNN layer only, but not those of the other CNN layers. 1) You can just recover the filters and use MATLAB's functions to display them as images. For example, after loading a pretrained net from http://www.vlfeat.org/matconvnet/pretrained/ : imshow( net.layers{1}.filters(:, :, 3, 1), [] ) ; 2) You may find the VLFeat function vl_imarraysc useful to display several filters. http://www.vlfea...

computer vision - caffe multi-label training with lmdb to classify facial regions

I'm using two lmdb inputs for identifying the eyes, nose tip and mouth regions of a face. The data lmdb is of dimension Nx3xHxW while the label lmdb is of dimension Nx1xH/4xW/4. The label image is created by masking regions using numbers 1-4 on an OpenCV Mat that was initialized to be all 0s (so in total there are 5 labels, with 0 being the background label). I scaled down the label image to be 1/4 of the width and height of the corresponding image because I have 2 pooling layers in my net. This downscaling ensures the label image dimension will match th...