Fog of Code

Random thoughts about code, software and life

Running OpenCV on OSX with OpenCL support


It all started when I wanted to test how a simple algorithm implemented with OpenCV could be optimized using the GPU. I chose the simplest path and decided to replace some of the OpenCV functions I was using with their GPU-accelerated counterparts. Let’s use matrix multiplication as an example. I couldn’t use the gpu (CUDA) module, as it is intended for NVIDIA GPUs while I use a MacBook Pro Retina with an Intel GPU. Luckily I found the ocl (OpenCL) module. The only problem was that OpenCV doesn’t fully support OpenCL on OSX.

Step #1: Build AMDBLAS and AMDFFT

OSX Mavericks comes with OpenCL libraries pre-installed, so no additional installation is required. When configuring an OpenCV source build with CMake you should see:

Use OpenCL: YES

This means that OpenCV will be built against OpenCL libraries. There is however another important point to notice:

--   OpenCL:
--     Version:                     static
--     libraries:                   -framework OpenCL
--     Use AMDFFT:                  NO
--     Use AMDBLAS:                 NO

If AMDFFT and/or AMDBLAS are reported as missing, as they will be unless you installed them manually, then the OpenCL-enabled OpenCV build will be incomplete and will raise runtime errors when you call linear algebra operations such as ocl::gemm. To overcome this limitation we need to install AMDBLAS and AMDFFT from AMD’s clMath website. The only problem is that pre-compiled versions are available only for Windows and Linux, so clAmdFft and clAmdBlas need to be built from source.

Step #2: Run CMake with AMDBLAS and AMDFFT

After building AMDBLAS and AMDFFT we need to tell CMake to include these libraries when building OpenCV. The easiest way I found to do that is by specifying:



And of course doing the same for AMDFFT. This time CMake would print:

--   OpenCL:
--     Version:                     static
--     libraries:                   -framework OpenCL
--     Use AMDFFT:                  YES
--     Use AMDBLAS:                 YES

From here on I’ll demonstrate only how to use AMDBLAS. AMDFFT should be handled in a similar manner.

Step #3: Load the AMDBLAS dynamic library at runtime

Now we should be ready to start coding, but wait… Where would the runtime find the AMDBLAS dynamic library? Diving into the code, modules/core/src/opencl/runtime/opencl_clamdblas.cpp reveals:

 static void* openclamdblas_check_fn(int ID)
 {
     // ...
     void* func = CV_CL_GET_PROC_ADDRESS(e->fnName);
     // ...
 }

At runtime this line tries to find the specific AMDBLAS function in the default OpenCL library, which in our case (OSX) is /System/Library/Frameworks/OpenCL.framework/Versions/Current/OpenCL. But as we already know, it doesn’t come with AMDBLAS bundled. To overcome this we need to change the function that looks up the AMDBLAS symbols:

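#include <dlfcn.h>
#include <stdio.h>

// Looks up `name` in the clBLAS dylib instead of the system OpenCL framework
static void* AppleCLGetProcAddress2(const char* name)
{
    static bool initialized = false;
    static void* handle = NULL;
    if (!handle)
    {
        if (!initialized)
        {
            // Try to load the library only once
            initialized = true;
            const char* path = "/Users/me/Work/temp/clBLAS/src/library/libclBLAS.dylib";

            handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
            if (handle == NULL)
                fprintf(stderr, "Failed to load %s: %s\n", path, dlerror());
        }
        if (!handle)
            return NULL;
    }
    return dlsym(handle, name);
}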

 static void* openclamdblas_check_fn(int ID)
 {
     // ...
     void* func = AppleCLGetProcAddress2(e->fnName);
     // ...
 }

This is an ugly hack, but it works for test purposes. A better solution needs to be integrated into the official version.

Step #4: Almost there

Trying to run some code now will reveal another error: the AMDBLAS functions are still not found. This is due to some naming/versioning differences in the AMDBLAS functions:

"clAmdBlasSetup" -> "clblasSetup" 
"clAmdBlasSgemmEx" -> "clblasSgemm" 
"clAmdBlasTeardown" -> "clblasTeardown" 

The quick and dirty fix would be to edit /modules/core/src/opencl/runtime/autogenerated/opencl_clamdblas_impl.hpp for every function used:

-static const struct DynamicFnEntry clAmdBlasSetup_definition = { "clAmdBlasSetup", (void**)&clAmdBlasSetup};
+static const struct DynamicFnEntry clAmdBlasSetup_definition = { "clblasSetup", (void**)&clAmdBlasSetup};

I didn’t investigate why the naming conventions are different.

Sample code

This code should work now and show the difference between a naive nearest-neighbor search and an OpenCL-accelerated one (C++11). This isn’t the simplest example, but that’s what I had:

#include <cmath>
#include <iostream>
#include <vector>

#include <opencv2/ocl.hpp>

using namespace cv;

// OpenCL-accelerated variant
struct OCL {
  typedef ocl::oclMat MatType;
  static void mult(const MatType& A, const MatType& B, MatType& res) {
    //ocl::gemm(A, B.t(), 1.0, MatType(), 0, res, 0);
    ocl::BruteForceMatcher_OCL_base bf;
    bf.distType = ocl::BruteForceMatcher_OCL_base::L2Dist;
    std::vector<cv::DMatch> matches;
    bf.match(A, B, matches);
  }
};

// Plain CPU variant
struct OCV {
  typedef cv::Mat MatType;
  static void mult(const MatType& A, const MatType& B, MatType& res) {
    res = A * B.t();
  }
};

// Times a single call to T::mult and returns the duration in seconds
template <typename T, typename InterMat = typename T::MatType>
double test(bool isVerbose)
{
  typedef typename T::MatType MatType;
  int N = 3000;
  int D = 128;
  cv::Mat mat1(N, D, CV_32FC1); cv::randu(mat1, cv::Scalar(0), cv::Scalar(10));
  cv::Mat mat2(N, D, CV_32FC1); cv::randu(mat2, cv::Scalar(0), cv::Scalar(10));
  MatType res(N, N, CV_32FC1);
  auto imat1 = InterMat(mat1);
  auto imat2 = InterMat(mat2);
  auto t1 = getTickCount();
  T::mult(MatType(imat1), MatType(imat2), res);
  auto t2 = getTickCount();
  auto d = ((double)t2 - t1)/getTickFrequency();
  if (isVerbose) {
    std::cout << d << "[s]" << std::endl;
    //std::cout<< T::name<< d << std::endl;
  }

  return d;
}

int main()
{
  ocl::DevicesInfo devices;

  auto isVerbose = true;
  double sum = 0;
  for (auto ii = 0; ii < 100; ++ii)
  {
    auto d1 = test<OCL, cv::Mat>(isVerbose);
    auto d2 = test<OCV, cv::Mat>(isVerbose);

    // Running average of the CPU/OpenCL time ratio
    double change = d2/d1;
    sum += change;
    std::cout << "Average [" << ii+1 << "]: " << round(sum/(ii+1)) << std::endl;
  }

  return 0;
}

Taking all these steps should help you run some FFT and linear algebra code using OpenCV+OpenCL on OSX. You can find the changes I listed in this post summarized here

Written by xyand

February 23, 2014 at 7:57 am

Posted in OpenCV


Staying out of the spam folder


Recently I noticed that some legitimate emails started finding their way to my Gmail spam folder. I didn’t consider it an issue until I started noticing that some of the emails I sent were being ignored. Luckily I had other ways to contact the recipients, so I asked them to have a look at their spam folders. Surprise surprise, there they were: my emails!

Now, it’s one thing when personal emails are being ignored, but a totally different thing when business opportunities get lost.

My Setup

I used Zoho as my company email provider but was sending most of my emails from my personal Gmail account, by adding the Zoho account and using “Send mail as”. I was sending 5-10 emails a day, mostly personal, though many had similar content. No bulk sending.

Finding #1: “Send mail as” includes the original email address

Return-Path: <>
Received: from 
        ( [2a00:1450:400c:c00::22e])
        by with ESMTPS id q13si22316841wjr.20.2014.
        for <>
        (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
        Mon, 06 Jan 2014 00:01:39 -0800 (PST)
Received-SPF: pass 
   ( domain of 
   designates 2a00:1450:400c:c00::22e as permitted sender) 
   ( domain of 
   designates 2a00:1450:400c:c00::22e as permitted sender);


Although it seems that this email was successfully authenticated, it raises two questions:

  • Why should others know my personal email address?
  • Does this “Send mail as” setup lead spam filters to misclassify my emails?

Attempt #1: Not-spam

Hoping it would make a difference, I asked my friends to mark all my emails in their spam folders as “not spam”. It didn’t help much, as emails kept getting marked as spam.

Attempt #2: Avoid using a proxy

I decided not to take any chances and start sending email directly instead of by proxy. I also moved my mail provider to Google Apps and registered my domain with them.

Some emails still found their way to the spam folder.

Attempt #3: SPF + DKIM

When I started looking for standard means of authenticating email origin, I stumbled upon SPF and DKIM. SPF, Sender Policy Framework, is a mechanism system administrators use to define which hosts can send emails for a specific domain. In my case I wanted to configure SPF to allow Google Apps servers to send emails from my company domain. DKIM, DomainKeys Identified Mail, is a cryptographic mechanism used to associate a domain name with an email message, allowing you to take responsibility for the message and making it hard for others to tamper with the message contents.

Here are the official Google Apps SPF + DKIM instructions


I have no idea why my emails were marked as spam in the first place. I guess Google made some changes to its spam filters, as many legitimate emails were finding their way to the spam folder. I did all I could to be marked as “legit”, and from now on my emails seem to find their way to the recipients’ inboxes. I don’t know if it was a spam filter policy change, my attempts to be clean, or some divine intervention, but something did the trick.

Written by xyand

January 8, 2014 at 7:11 am

Posted in email


Meteor photo gallery – memory leaks


First of all, I want to apologize. Twice. The first is for introducing (at least) two memory leaks in the Meteor photo gallery code I published earlier. The second is for not posting the fix as soon as I discovered it.

Ok, so let’s start.

The first memory leak is simply a bug that collects all preloaded images in an array and never clears them:

    # Function that marks an image as loaded
    load = (id) -> return -> doing -= 1; Session.set("loaded" + id, true); 

Each image should be removed from the array once it has loaded:

      load = (id) -> return -> doing -= 1; Session.set("loaded" + id, true); delete loading[id]

The second leak is trickier. It all started when I abused session variables to store global data. I mentioned that it is a hack and should be changed, but missed the memory leak it introduced. Can you find the leak?

  # This triggers template rendering
  for id in photosVisible
    Session.set("visible" + id, true)
  # This triggers template removal
  for id in updatesFalse
    Session.set("visible" + id, false)

The problem is that the session variables never get garbage collected. Because they are accessed by name, there is no way for the GC to figure out that they won’t be used again. So we need to clear them explicitly.

for id in updatesFalse
  Session.set("visible" + id, undefined)
  Session.set("loaded" + id, undefined)

Much better. Now the gallery barely takes any RAM, compared to other galleries around.
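
To make the leak concrete, here is a minimal sketch in plain JavaScript. `KeyedStore` is a hypothetical stand-in for a name-keyed store such as Session, not Meteor’s actual implementation, and it assumes that setting a key to undefined removes the entry. Setting a key to false keeps one entry alive for every photo that was ever shown.

```javascript
// Hypothetical stand-in for a name-keyed store (not Meteor's Session).
// Assumption: setting a key to undefined removes the entry entirely.
class KeyedStore {
  constructor() { this.entries = new Map(); }
  set(key, value) {
    if (value === undefined) this.entries.delete(key);
    else this.entries.set(key, value);
  }
  get(key) { return this.entries.get(key); }
  get size() { return this.entries.size; }
}

// Every photo enters the viewport once and then leaves it again.
function simulateScroll(store, photoCount, clearWith) {
  for (let id = 0; id < photoCount; id++) {
    store.set("visible" + id, true);
    store.set("loaded" + id, true);
    store.set("visible" + id, clearWith);
    store.set("loaded" + id, clearWith);
  }
  return store.size;
}

const leaky = simulateScroll(new KeyedStore(), 1000, false);     // keys pile up
const clean = simulateScroll(new KeyedStore(), 1000, undefined); // keys removed
console.log(leaky, clean);
```

In this toy model the number of retained keys grows with the number of photos scrolled past in the first case and stays at zero in the second.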

Written by xyand

September 29, 2013 at 8:08 am

Posted in javascript, Meteor


Lightbox ambient background with CSS3



Nice? Keep reading. This post is 100% inspired by (sample).

So how is it done?

The recipe is quite simple:
1. Place an image in the center of the page
2. Create a blurred background image and add some noise
3. Place the background image so it covers the entire background (zoom)

I’ll show here a CSS3 implementation (didn’t make it cross browser, sorry) to avoid creating/reading the background image. Let’s start.

I assume that you have two versions of the image: one for thumbnails and the other for full screen view. The thumbnail is the better choice for the background image as it is already loaded.


<div class="lightbox">
    <div class="background"></div>
    <div class="noise"></div>
    <img src=""/>
</div>

As you can see we have two placeholders for the background and the noise. Let’s move on to the CSS:


.lightbox {
    position: fixed;
    height: 100%;
    width: 100%;
    background-color: black;
}

.background {
    top: 0;
    left: 0;
    position: absolute;
    width: 20%;
    height: 20%;

    background: url(...) no-repeat; /* thumbnail image URL */
    background-size: cover;
    -webkit-filter: blur(3px) brightness(0.8);
    -webkit-transform-origin: top left;
    -webkit-transform: scale(5);
}

There are two interesting things to notice here. The first is that we use -webkit-filter: blur() to create the blur effect. The second is the width: 20%; height: 20% and -webkit-transform: scale(5);. If we remove these lines, everything should look the same, with a small exception. The browser would apply the blur filter pixel-wise on the full size image. Try that and you’ll notice a considerable degradation in performance (when used inside a gallery). Applying the filter on a small image and then scaling the result has the same effect but works much much faster.
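
A rough back-of-the-envelope sketch of why this helps, assuming the cost of the blur grows with the number of pixels it touches (how a browser actually implements the filter may differ). The 1920×1080 viewport and the function name are just illustrative values.

```javascript
// Number of pixels the filter has to process when the element covers
// `fraction` of the viewport in each dimension.
function blurredPixelCount(viewportWidth, viewportHeight, fraction) {
  return Math.round(viewportWidth * fraction) * Math.round(viewportHeight * fraction);
}

const fullSize = blurredPixelCount(1920, 1080, 1);   // blur applied at 100% x 100%
const reduced = blurredPixelCount(1920, 1080, 0.2);  // blur applied at 20% x 20%
const ratio = fullSize / reduced;

console.log(fullSize, reduced, ratio);
```

At 20% of the width and 20% of the height, the filter touches 1/25 of the pixels; the scale(5) transform then stretches the already blurred result back to full size.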


.noise {
  position: absolute;
  top: 0;
  bottom: 0;
  left: 0;
  right: 0;
  background-image: url(... See fiddle ...ErkJggg==);
  opacity: 0.5;
}

By overlaying a semi-transparent noise image on top of the background we create the granularity effect.

You can test the final result here fiddle

Written by xyand

August 14, 2013 at 4:38 pm

Posted in css, html, Uncategorized


CSS transitions with Meteor


This is my second attempt to do transitions with Meteor. The first time I was new to Meteor and was going against the stream. Eventually it worked, but the code was so sensitive that any small change broke some existing functionality. Mixing Meteor declarative style with imperative js is not fun at all.
This time I decided to try and do it the Meteor way.

The basics

CSS transitions fit nicely with Meteor as it is declarative by nature. Once you define that a certain element has a transition, it is up to the browser to monitor changes and apply the transition when applicable.

.class {
  -webkit-transition: all 0.5s ease;
  -moz-transition: all 0.5s ease;
  -o-transition: all 0.5s ease;
  -ms-transition: all 0.5s ease;
  transition: all 0.5s ease;
}

That was the CSS side of the story. Now let’s get back to Meteor. Let’s keep in mind that Meteor is simply JavaScript code that performs DOM manipulations with a certain timing. For the transitions to work, these DOM updates need to be transition-able. There are certain DOM changes that transitions don’t apply to:

  • Changing display (display:none)
  • Removing element from DOM

The first one is pretty easy and is not Meteor specific. Simply use other means to hide elements. The second one is a little more complicated. So when does Meteor remove an element from the DOM?

  • When it re-renders the element
  • When an element is conditionally included

To avoid removing and re-adding an element upon re-render, we need to ask Meteor to preserve that element via Template.templateName.preserve(). It is straightforward for single-element selectors and a little more complicated when a selector matches more than one element. In that case we need to add a unique identifier to each element and use the following form:

    # Needed for transitions
    ".wrapper-compare": (node) -> node.getAttribute("data-id")

That way the element is preserved between re-renders. Keep in mind that any property you added to that element dynamically (jQuery, etc.) won’t be preserved.

Conditional elements

By conditional element I mean something like:

{{#if isHelperTrue}}
   <div id="conditionalDiv">Content...</div>
{{/if}}

If you want these elements to animate on show/hide, simply use another mechanism to hide them. One candidate could be style="position:absolute;width:0;height:0". This can be problematic if you hide MANY elements and clutter the DOM.

This should be it. I’ll try to post a working example soon. Contact me if you want to see it in action before that.

Written by xyand

July 25, 2013 at 3:43 pm

Posted in css, Meteor


Meteor photo gallery – code time


In my last post I wrote about some requirements and limitations that we should take into account when implementing a good image gallery. Now it’s time to dive into the details of developing such a gallery in Meteor. The code provided here was written to solve a real need in my application; it wasn’t intended for educational/library purposes, so it is less than perfect. Please tell me if you think it can be done better.

In my code I assume that all image metadata can be loaded from the database before rendering the gallery. That is good enough in my case. If it is a problem in your application, you could simply add pagination logic to load in smaller chunks.

Track the gallery scroll position reactively – as you will see in the following snippets, we need to respond to changes in gallery scroll position. Storing this position in a reactive variable will trigger these responses.

  # Set mouse wheel to scroll horizontally (the gallery is horizontal)
  $("body").mousewheel((event, delta) ->
    $("#galleryScroll")[0].scrollLeft -= delta *

Part #1: Get the data

Scroll position is the most important parameter in this problem. It defines what data should be displayed and which shouldn’t. We also need to know in advance the position of every image to decide if it’s visible or not. In my gallery (design consideration) the position of one image depends on the position of all the images that precede it, so fetching is required. The following reactive autorun calculates the positions of all images in the gallery. It is a reactive context which is invalidated as the data (images) or layout (number of rows, row height, etc.) changes.

# Set location
Session.set("photosSorted", [])

#################### Position photos ####################
  nRows = Session.get("nRows")

  # Spacing between the thumbnails
  spaceThumbY =
  spaceThumbX =

  # Current horizontal in every row
  xCurrentMod = (spaceThumbX for a in [1..nRows])
  yCurrentMod = (spaceThumbY + (2*spaceThumbY + Session.get("heightThumb"))*idx for idx in [0..nRows-1])

  idx = 0
  photosTmp = []
  photosInView().forEach (photo) ->

    # Gallery layout specific logic
    rowCurrent = idx % mod
    xCurrent = xCurrentMod[rowCurrent]
    widthPhoto = widthThumb(photo)
    leftPhoto = xCurrent + spaceThumbX
    rightPhoto = leftPhoto + widthPhoto + spaceThumbX

    # Absolute position of thumbnails in the gallery
    Session.set("left" + photo._id, leftPhoto)
    Session.set("top" + photo._id, yCurrentMod[rowCurrent])

    photosTmp.push({"left": leftPhoto, "right": rightPhoto, "_id": photo._id})
    xCurrentMod[rowCurrent] = rightPhoto
    idx += 1

  # Array of all thumbnails sorted from left to right (the scroll direction)
  Session.set("photosSorted", _.sortBy(photosTmp, (x) -> x.left))

  # The rightmost part of the gallery
  Session.set("scrollLeftEnd", Math.max.apply(this, xCurrentMod))

So what did we do here? We calculated two things. One is the position of every image, which lets us place it in the gallery and, more importantly, is the key to loading/pre-loading/unloading images. The other is the array of all images sorted from left to right, which we will search when deciding what to load.
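
The same layout idea can be sketched in plain JavaScript, without Session or reactivity. This is a simplified illustration and not the gallery’s actual code: the function name, the `id`/`width` fields and the round-robin row assignment (`idx % nRows`) are assumptions made for the example.

```javascript
// Places photos round-robin into nRows horizontal rows and returns them
// sorted by their left edge, plus the rightmost extent of the gallery.
function layoutPhotos(photos, nRows, spaceX) {
  // Current horizontal position in every row
  const xCurrent = new Array(nRows).fill(spaceX);

  const placed = photos.map((photo, idx) => {
    const row = idx % nRows;
    const left = xCurrent[row] + spaceX;
    const right = left + photo.width + spaceX;
    xCurrent[row] = right; // the next photo in this row starts here
    return { id: photo.id, row, left, right };
  });

  // Sorted from left to right (the scroll direction)
  const sorted = placed.slice().sort((a, b) => a.left - b.left);
  return { sorted, scrollEnd: Math.max(...xCurrent) };
}

const layout = layoutPhotos(
  [
    { id: "a", width: 100 },
    { id: "b", width: 50 },
    { id: "c", width: 80 },
    { id: "d", width: 40 },
  ],
  2,   // rows
  10   // horizontal spacing
);
console.log(layout.sorted.map((p) => p.id), layout.scrollEnd);
```

Because each row advances independently, the order of the photos by left edge can differ from their order in the collection, which is why the sorted array is built as a separate step.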

Part #2: Load the photos

After we know the location of every image, now is the time to load and display the appropriate images based on the scroll position.

  #################### Load photos ####################
  scrollLeft = 0
  photosVisible = []
  photosToLoad = []

  Meteor.autorun( ->
    # Whenever the gallery scroll position changes, the gallery
    # contents need to be recalculated.
    scrollLeftNew = Session.get("galleryScrollLeft")

    widthScroll = Session.get("widthGallery")

    # Control scroll sensitivity - For fluent UI we don't want to
    # respond to every tiny mouse scroll, we are safe with our
    # pre-loaded margins
    diffScroll = Math.abs(scrollLeftNew - scrollLeft)
    underMinScrol = diffScroll < widthScroll/
    if underMinScrol
      return # Assumed from context: ignore scroll changes below the threshold
    scrollLeft = scrollLeftNew

    # This is the margin around the viewport that we use to pre-load images that
    # are not visible yet. `leftScroll` and `rightScroll` are the horizontal
    # limits of the gallery region we are going to populate with images
    margin = widthScroll* 
    leftScroll = scrollLeft - margin     
    rightScroll = leftScroll + widthScroll + 2*margin
    # We use an efficient binary search here to find the range of visible
    # images. We use the reactive sorted array that we calculated earlier
    photosSorted = Session.get("photosSorted")
    posLeft = _.sortedIndex(photosSorted, "left": leftScroll, (x) -> x.left)
    posRight = _.sortedIndex(photosSorted, "left": rightScroll, (x) -> x.left)

    # Ordering the array so the visible are loaded first
    # This is a little trick that re-arranges the array
    # of images to be loaded from:
    #  left-of-viewport -> in-viewport -> right-of-viewport
    # to:
    #  in-viewport -> left-of-viewport -> right-of-viewport
    # It would affect our pre-loading and cause the visible
    # images to be loaded first
    photosVisibleNew = []
    photosVisibleMargin = []
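    for photo in photosSorted[posLeft..posRight]
      inLeft = (photo.left - scrollLeft > 0) and (photo.left - scrollLeft < widthScroll)
      inRight = (photo.right - scrollLeft > 0) and (photo.right - scrollLeft < widthScroll)
      # Branch bodies reconstructed from context: ids inside the viewport go
      # first, ids in the pre-load margin are appended afterwards
      if inRight or inLeft
        photosVisibleNew.push(photo._id)
      else
        photosVisibleMargin.push(photo._id)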

    photosVisibleNew.push(photosVisibleMargin...) # Concat

    # These are the images that were visible and have to be removed
    updatesFalse = _.difference(photosVisible, photosVisibleNew);
    photosVisible = photosVisibleNew

    # These are the images that have to be pre-loaded in order to control
    # the loading order and not leave it to the browser.
    photosToLoad = photosVisible[..] # Copy

    # This triggers template rendering
    for id in photosVisible
      Session.set("visible" + id, true)

    # This triggers template removal
    for id in updatesFalse
      Session.set("visible" + id, false)
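
The range lookup and the visible-first ordering can be sketched in plain JavaScript as follows. This is a simplified illustration under a few assumptions: photos are objects with `id`, `left` and `right`, the array is already sorted by `left`, and the names are made up for the example. Unlike the code above, the visibility test here is a plain overlap check.

```javascript
// Index of the first photo whose left edge is >= x (binary search).
function lowerBound(sorted, x) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid].left < x) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Ids to show, ordered so that photos inside the viewport come first and
// photos that only fall inside the pre-load margin come after them.
function photosToShow(sorted, scrollLeft, viewWidth, margin) {
  const from = lowerBound(sorted, scrollLeft - margin);
  const to = lowerBound(sorted, scrollLeft + viewWidth + margin);
  const visible = [];
  const nearby = [];
  for (const photo of sorted.slice(from, to)) {
    const inView = photo.right > scrollLeft && photo.left < scrollLeft + viewWidth;
    (inView ? visible : nearby).push(photo.id);
  }
  return visible.concat(nearby);
}

const sortedPhotos = [
  { id: "a", left: 0, right: 100 },
  { id: "b", left: 120, right: 220 },
  { id: "c", left: 240, right: 340 },
  { id: "d", left: 360, right: 460 },
  { id: "e", left: 480, right: 580 },
];
const order = photosToShow(sortedPhotos, 230, 150, 150);
console.log(order);
```

Note that, as in the original, the search is done on the left edge only, so a very wide photo that starts left of the margin but still reaches into the viewport would be missed.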

The image loading happens in the background. Notice that when the scroll position changes there might be images that were in the loading queue but weren’t loaded yet. Here we simply remove them from the queue, whereas if we let the browser load them using img tags, we’d have to wait until they were fully loaded before the currently visible images could be loaded. This happens when one scrolls fast to the end of the gallery. Image pre-loading:

  # The number of images being pre-loaded at any given moment
  doing = 0
  loading = []
      # Function that marks an image as loaded
      load = (id) -> return -> doing -= 1; Session.set("loaded" + id, true);

      # Try to load images until there are no more or we reached our limit of
      # concurrent loads. All images that have not been loaded will be loaded
      # during the next call to this interval call
      while doing < and photosToLoad.length > 0
        # Take photo out of the load queue
        id = photosToLoad.shift()

        # Skip if already loaded
        if not Session.get("loaded" + id)
          doing += 1

          # Trigger loading
          img = new Image
          img.onload = load(id)
          photo = Photos.findOne("_id": id) # This could be eliminated
          img.src = photo.src

We limit the number of concurrent loads as we can’t cancel a load that was already initiated. Having too many concurrent loads hurts responsiveness. Imagine that you scrolled to a position in the gallery and then scrolled again before the images were loaded. To see the images in the new position you would have to wait until all images from the last position are loaded.
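
Here is a minimal sketch of such a bounded pre-load queue in plain JavaScript. It is not the gallery’s code: instead of polling on an interval it starts the next load whenever one finishes, and the actual loading is delegated to a `startLoad(id, done)` callback (a hypothetical parameter) so the sketch also runs outside a browser. In a browser, `startLoad` would create an Image, set its `onload` to `done` and assign its `src`.

```javascript
function createPreloader(startLoad, maxConcurrent) {
  let queue = [];
  let active = 0;
  const inFlight = new Set();
  const loaded = new Set();

  // Start loads until the queue is empty or the concurrency limit is hit.
  function pump() {
    while (active < maxConcurrent && queue.length > 0) {
      const id = queue.shift();
      if (loaded.has(id) || inFlight.has(id)) continue; // nothing to do
      active += 1;
      inFlight.add(id);
      startLoad(id, () => {
        active -= 1;
        inFlight.delete(id);
        loaded.add(id);
        pump();
      });
    }
  }

  return {
    // Replacing the queue drops pending loads that are no longer needed;
    // loads that already started cannot be cancelled.
    setQueue(ids) {
      queue = ids.slice();
      pump();
    },
    loaded,
  };
}

// Usage with a fake loader that lets us finish loads by hand.
const started = [];
const pendingDone = [];
const preloader = createPreloader((id, done) => {
  started.push(id);
  pendingDone.push(done);
}, 2);

preloader.setQueue(["a", "b", "c", "d"]); // "a" and "b" start, "c" and "d" wait
preloader.setQueue(["e", "f"]);           // user scrolled: "c" and "d" are dropped
pendingDone.shift()();                    // "a" finishes, so "e" starts
console.log(started);
```

With a limit of two, the user only waits for the loads already in flight before images at the new scroll position start loading.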

Part #3: The gallery

Now that we have all the data ready it is time to render it into a nice gallery:

<template name="groupThumbs">
{{#each photos}}
  <!-- Show the image box even if not visible (aesthetic reasons only) -->
  <div class="group-thumb">
    <div class="group-inner">
      {{#isolate}} <!-- No need to render the entire gallery because one image changed -->
        {{#if isVisible}}
          <img src="{{photoThumb}}" class="{{classLoaded}}">
        {{/if}}
      {{/isolate}}
    </div>
  </div>
{{/each}}
</template>

    heightThumb: -> getHeightThumb()

    widthThumb: ->
      ratio = getHeightThumb() / @photo.height
      return Math.round(@photo.width * ratio)

    left: -> Session.get("left" + @_id)
    top: -> Session.get("top" + @_id)
    isVisible: -> Session.get("visible" + @_id) and Session.get("loaded" + @_id)


If you got this far, you should be able to write your own gallery. One thing I must add about this code: it’s very sensitive, and almost every change will affect performance or user experience.

If you think you can make this code better, then please let me know.

Good luck!

Written by xyand

July 17, 2013 at 8:00 pm

Meteor photo gallery – intro


This will be the first post in a series of two. In this post I will lay out the requirements a good photo gallery should fulfill and some extra considerations we should pay attention to when writing an image gallery. The next post will go into the details of implementing such a gallery in Meteor. So let’s begin.

It all started when I implemented my first image gallery, using Meteor. It was a simple gallery. I got image details from a collection query and rendered it all to the DOM … what could be easier? As the number of photos grew, my naive implementation didn’t cut it, so I had to find a better way to deal with it. But before I describe my solution, let’s consider the requirements and limitations of a large image gallery hosted in a web browser:


Requirements:

  1. Fast loading

    1. Load the gallery fast

    2. Load individual image content fast

  2. Fast scroll

Limitations:

  1. Memory usage – A large DOM is a bad idea, and a large DOM with many images loaded in memory is a very bad idea.

  2. CPU usage – Making too many event-driven computations can badly hurt the user experience.

  3. Image loading order/limits – The browser limits how many images can be loaded at once (1 image = 1 http request). It also controls the order in which images are loaded.


So in order to satisfy these basic usability requirements we would have to:

  1. Load optimized size thumbnails (fast load – images)

  2. Load images that are currently in the viewport first (fast load – gallery)

  3. Avoid loading images that are not near the viewport at any given moment (fast load – gallery)

  4. Unload images far from the viewport (minimize memory/cpu usage)

  5. Cancel scheduled pre-loads which are no longer needed (fast scroll)

  6. Preload images that are likely to enter the viewport soon (fast scroll)

  7. Avoid making computations that affect scroll speed (fast scroll)

  8. Avoid making unnecessary computations that would hurt overall performance (fast scroll)

  9. Optimize mandatory computations (minimize cpu usage)

I may have forgotten a few, but these are the main issues that we’ll take care of in the next post. Stay tuned.


Written by xyand

July 16, 2013 at 11:43 am

Posted in javascript, Meteor


