Back in August 2017 I needed the SWT (Stroke Width Transform) algorithm to detect images that contain text in any form. At that time there was no implementation efficient enough for my case, so I decided to create Python bindings for libccv’s implementation of SWT. That was over 3 years ago. The solution has become outdated over time: it’s now officially the Python 3 era, with Docker all over the place. As the previous post still receives some traffic and my GitHub repo is active, I have decided to update my solution to support the newest stable libccv, Python 3.9 and Docker.

TL;DR

Stroke Width Transform for Python 3 - how to use a fast SWT implementation from libccv in Python 3.9 and/or Docker.

Prerequisites

Option A - local build:

  • g++ compiler
  • Python 3.9 (it will probably work with 3.6+ too)
  • cmake >= 3.17
  • libjpeg-dev, libpng-dev, libatlas-base-dev, libblas-dev installed
  • basic knowledge of C++
  • pybind11

Option B - docker image:

  • docker
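
Before starting a local build, the tool prerequisites above can be checked quickly. The snippet below is only a minimal sketch: the names `REQUIRED_TOOLS`, `missing_tools` and `python_is_supported` are made up for illustration, the minimum Python version reflects the "probably works with 3.6+" guess, and it checks neither the cmake version nor the development libraries.

```python
import shutil
import sys

# Command-line tools from the prerequisites list (hypothetical constant name).
REQUIRED_TOOLS = ("g++", "cmake")
# Assumption: 3.6 is the untested lower bound mentioned in the list above.
MIN_PYTHON = (3, 6)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Check whether a (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing tools:", missing_tools() or "none")
```

The `which` parameter exists only so the lookup can be replaced in a test; in normal use the defaults are enough.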

SWT bindings for Python 3.9

Here is how:

  1. Download and compile libccv as a static library (libccv.a)
  2. Implement bindings for libccv using pybind11
  3. Compile the library and use it

A note: I’m not a C++ developer and I hadn’t used pybind11 before, so in my opinion no prior experience with this toolset is required.

Compiling libccv for use in C++

There is actually not much to be done. I’ve cloned the original libccv repository and changed one flag in the makefile to make it work.

CFLAGS := -O3 -fPIC -ffast-math -Wall $(CFLAGS)

See https://github.com/marrrcin/ccv/commit/5d945a6a02283880b4be4370d4dd6dd229402820

I recommend just cloning my fork from https://github.com/marrrcin/ccv/tree/stable :

git clone --single-branch --branch stable --depth 1 https://github.com/marrrcin/ccv.git /ccv

Before the compilation starts, the following packages need to be installed in the system (it should work both on macOS and Linux):

libjpeg-dev
libpng-dev
libatlas-base-dev
libblas-dev

After the required libraries were installed, I ran the following command from the ./lib directory:

./configure && make libccv.a

The compilation is rather lengthy if all of the optimization flags are enabled (which they are, by default). Also, when I ran this compilation on macOS I saw peak memory usage of 7GB, so keep this in mind. If you want to trade library speed for shorter compilation time and lower resource usage, replace -O3 with -O2, -O1 or -O0 to reduce or disable the compiler’s optimizations.
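
If you prefer to script the flag change rather than edit it by hand, the substitution can be sketched as below. This is only an illustration working on text: the helper name `set_optimization_level` is made up, and which file actually holds the flags in your checkout is something to verify yourself.

```python
import re


def set_optimization_level(makefile_text: str, level: int) -> str:
    """Replace stand-alone -O0..-O3 tokens with the requested level."""
    if level not in (0, 1, 2, 3):
        raise ValueError("level must be 0, 1, 2 or 3")
    # Match -O<digit> only when it is a whole whitespace-delimited token,
    # so unrelated flags that merely contain "-O" are left untouched.
    return re.sub(r"(?<!\S)-O[0-3](?!\S)", f"-O{level}", makefile_text)


line = "CFLAGS := -O3 -fPIC -ffast-math -Wall $(CFLAGS)"
print(set_optimization_level(line, 1))
# prints: CFLAGS := -O1 -fPIC -ffast-math -Wall $(CFLAGS)
```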


Implementing pybind11 bindings for SWT in libccv

I created a new cmake-based C++ project for a shared library and copied the libccv.a and ccv.h files from the ./lib directory (of the cloned libccv repo) into a new folder within my project, also called lib.

The complete bindings for SWT consist of about 40 lines of C++ code, which I explain below.

// Include libccv headers to use stroke width transform

extern "C" {
    #include "lib/ccv.h"
}

// Include pybind11 headers to create Python bindings
#include <pybind11/pybind11.h>

namespace py = pybind11;

// Run SWT on the in-memory image (e.g. PNG or JPEG) and return the detected rectangles
py::list swt(char *bytes, int array_length){
    py::list result; // Initialize plain-Python List
    ccv_enable_default_cache(); // Enable libccv's default cache (one call is enough)
    ccv_dense_matrix_t* image = 0;
    
    // Read in-memory image
    ccv_read((void*)bytes, &image, CCV_IO_GRAY | CCV_IO_ANY_STREAM, array_length);
    if(image != 0) {
        ccv_array_t *words = ccv_swt_detect_words(image, ccv_swt_default_params);
        if (words) {
            int i;
            for (i = 0; i < words->rnum; i++) {
                ccv_rect_t *rect = (ccv_rect_t *) ccv_array_get(words, i);
                py::dict item; // Initialize plain-Python dictionary and fill it with values
                item["x"] = rect->x;
                item["y"] = rect->y;
                item["width"] = rect->width;
                item["height"] = rect->height;
                result.append(item);
            }
            ccv_array_free(words);
        }
        ccv_matrix_free(image);
    }

    return result;
}

// Define pybind11 module
PYBIND11_MODULE(swt_python3, m) {
    m.doc() = "Python 3 compatible SWT (Stroke-Width-Transform) binding to libccv by Marcin Zabłocki";
    m.def("swt", &swt, "Apply Stroke-Width-Transform on input byte stream");
}

One key difference when compared to my old implementation is that with pybind11 I could use well-known Python structures such as lists and dictionaries directly from C++ code, making the whole “from C++ to Python” data passing a piece of cake 🎉.

In order to compile the bindings into a Python-usable package there are a few additional steps required.

  1. The project needs a CMakeLists.txt file so that cmake knows how to compile it. This file sets all of the necessary linker and pybind11 configuration:

    cmake_minimum_required(VERSION 3.17)
    project(swt_python3)
    
    set(CMAKE_CXX_STANDARD 11)
    set(CMAKE_CXX_FLAGS "-O3")
    set(CMAKE_CXX_FLAGS_DEBUG "-g -O3")
    
    find_library(CCV ccv lib)
    find_package(JPEG REQUIRED)
    find_package(PNG REQUIRED)
    #find_package(PythonLibs) # Either uncomment this or specify `PYTHON_INCLUDE_DIRS` variable during build time
    include_directories(${PYTHON_INCLUDE_DIRS})
    
    add_subdirectory(pybind11)
    pybind11_add_module(swt_python3)
    target_sources(swt_python3 PRIVATE swt_python3.cpp)
    target_link_libraries(swt_python3 PRIVATE ${CCV} ${JPEG_LIBRARIES} ${PNG_LIBRARIES})
    
  2. The project needs to know where to find pybind11. This can be specified in another CMakeLists.txt file in the pybind11 folder within the project:

    include(FetchContent)
    
    FetchContent_Declare(
            pybind11
            URL   "https://github.com/pybind/pybind11/archive/v2.6.1.tar.gz"
    )
    
    FetchContent_MakeAvailable(pybind11)
    

That’s it. I compile the wrapper using the following command:

cmake -DCMAKE_BUILD_TYPE=Release -DPYTHON_INCLUDE_DIRS=/usr/local/include/python3.9 . && make

It creates a file called swt_python3.cpython-39-darwin.so (on macOS) or swt_python3.cpython-39-x86_64-linux-gnu.so (on Linux), which is a Python-ready module to import!

TIP: in order to find PYTHON_INCLUDE_DIRS just run:

python -c "from sysconfig import get_paths as gp; print(gp()['include'])"
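
The same lookup can be done from a script, which also gives you the file name the build should produce. The following is a rough sketch, not part of the original project: the function names are made up, and the assumption that the compiled module's suffix matches the interpreter's `EXT_SUFFIX` should be checked against your own build output.

```python
import sysconfig

MODULE_NAME = "swt_python3"  # module name used in the bindings above


def python_include_dir() -> str:
    """Return the include directory of the running interpreter."""
    return sysconfig.get_paths()["include"]


def expected_module_filename(name: str = MODULE_NAME) -> str:
    """Guess the compiled module's file name for this interpreter."""
    # EXT_SUFFIX looks like ".cpython-39-x86_64-linux-gnu.so";
    # the ".so" fallback is an assumption for builds that do not define it.
    suffix = sysconfig.get_config_var("EXT_SUFFIX") or ".so"
    return f"{name}{suffix}"


def cmake_configure_command() -> list:
    """Build the cmake configure command used above as an argument list."""
    return [
        "cmake",
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DPYTHON_INCLUDE_DIRS={python_include_dir()}",
        ".",
    ]


if __name__ == "__main__":
    print(" ".join(cmake_configure_command()))
    print("Expected output file:", expected_module_filename())
```

Run it with the same interpreter you intend to import the module from, since both values are specific to that Python installation.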

Using stroke width transform in Python 3

The compiled binding is ready to be imported directly from Python code, so if you want to use it anywhere, just copy the .so file into your project’s directory and call:

from swt_python3 import swt

Invoking SWT on any image is as simple as:

from typing import List

# Read the raw, still-encoded image bytes (JPEG or PNG)
with open("input.jpg", "rb") as f:
    buffer = f.read()
swt_result: List[dict] = swt(buffer, len(buffer))
for item in swt_result:
    x, y, width, height = [item[key] for key in ("x", "y", "width", "height")]
    print(x, y, width, height)

Thanks to the pybind11 magic, the interface is much more pleasant than last time.
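
Because the result is a plain list of dictionaries, post-processing needs no special types. Below is a small sketch of two helpers that work on data shaped like the swt() output: one converts a rectangle to corner coordinates, the other makes a simple "does this image contain text" decision. The helper names, the `min_area` threshold and the sample rectangles are all made up for illustration and are not real detector output.

```python
from typing import Dict, List, Tuple

Rect = Dict[str, int]            # {"x": ..., "y": ..., "width": ..., "height": ...}
Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def to_box(rect: Rect) -> Box:
    """Convert an x/y/width/height dict to corner coordinates."""
    return (
        rect["x"],
        rect["y"],
        rect["x"] + rect["width"],
        rect["y"] + rect["height"],
    )


def contains_text(rects: List[Rect], min_area: int = 100) -> bool:
    """Treat an image as containing text if any detected box is large enough."""
    # min_area is an arbitrary illustrative threshold for ignoring tiny boxes.
    return any(r["width"] * r["height"] >= min_area for r in rects)


# Made-up rectangles in the same shape as the swt() result.
sample = [
    {"x": 10, "y": 20, "width": 120, "height": 30},
    {"x": 5, "y": 200, "width": 4, "height": 3},
]
print([to_box(r) for r in sample])  # [(10, 20, 130, 50), (5, 200, 9, 203)]
print(contains_text(sample))        # True
```

The corner-coordinate form is convenient if you later want to draw the rectangles onto the image with a drawing library of your choice.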

Input
Python 3 SWT input image (compressed)
Output
Python 3 SWT output image (compressed)

SWT for Python 3 in Docker

As many modern deployments are Docker-based, I’ve decided to also provide a pre-built Docker image that contains the compiled, ready-to-use SWT wrapper with Python 3.9. The image includes an example.py script that runs SWT on an image and saves the result into another one.

docker pull marrrcin/swt-python3-ccv:20210111
docker run --rm -it --entrypoint bash marrrcin/swt-python3-ccv:20210111

Dockerfile: https://github.com/marrrcin/swt-python/blob/86a7553d1861f4dd7114c01b6ae512090d7ee14f/Dockerfile

Summary

I hope you liked it. Please do not hesitate to create an issue or PR on GitHub if you find any problems.
